date:20090713

How would I write the two selections each in a single subset command?

1) Two non-overlapping time ranges I want to collect together - before
10AM and after noon. Should be an OR function:

X = subset(A, t1000)  + subset(A, t1200)

2) One range between two defined times like after 10AM and before
noon. Should be an AND function:

X1 = subset(A, t1000)
X = subset(X1, t1200)

Thanks,
Mark

P.S. - The help system seems very difficult for finding this sort of
information!

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Anomaly in sample() function

2009-07-13 Thread Wolfgang Polasek

Hi all

Maybe someone knows a way to solve this anomaly in sample():
I like to compute a sample (n=100) with replications from a population  of
2500 units but if I draw repeated samples from it I dont get what seems to
be a representative sample if I look at other partitions of the population.
Enclosed is the population g99 with 4 columns: (units, partition 1 (site),
partition 2 (type), weights);
and my R program.

The problem: Some categories from  partition 2 (type) which I use to check
for representativeness, deviates up to 20 percentage points from the
population.
As criterion I have computed the mean difference and the SD of the relative
frequencies between sample and pop. What mean deviation is to expect?

Thanks for any ideas,
W. Polasek

dimnames(g99)[[1]] =paste(g99[,1])
s1= g99[paste(sample(g99[,1], 100, F, g99[,4])),1:4]
dim(s1)
j2 =table(s1[,3])   #sample density
j2g= table(g99[,3]) #pop density
chisq.test(j2g,j2)

p2=100*j2g / sum(j2g) #rel. frequency in pop
pd=p2-100* j2/sum(j2)  #difference of rel. frequency between pop and sample
round(rbind(j2g, p2, pd),2)
sum(abs(pd));sd(pd) #look for the 'best' representative sample
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] 95% Confidence Intervals for AUC - $auc.samples from the Daim Package

2009-07-13 Thread lara harrup (IAH-P)


Hi

I am trying to perform a bootstrap estimate of classification accuracy of a 
logistic regression using the 'Daim' package in r using the code at the bottom 
of this post, this all works great and I get the .632+ misclassification 
accuracy, specificity, sensitivity, AUC etc etc but what I would like is to 
access the list of AUC for each of the bootstrap samples as I need calculate 
the 95% confidence intervals for the AUC of the ROC curve for this data using 
this model. I was hoping to get this from the 2.5% and 97.5% quartiles of the 
AUCs of the 10,000 bootstrap samples that were calculated.

I have set returnSample to TRUE which I think should mean the data from each 
bootstrap is saved and when I run auc(x) I get the following

 auc(x,auc.samples)
$auc.632p
[1] 0.8296788

$auc.632
[1] 0.8302014

$auc.loob
[1] 0.8216521

$auc.app
[1] 0.8455208

$auc.samples
list()

my question is how do I access the $auc.samples list() or write it to a csv 
file or something similar to get the 2.5% and 96.5% quartile range from? or is 
there another way to get the 95% CI for the AUC?

Many thanks in advance

Lara

lara.har...@bbsrc.ac.uk



library(Daim)

mylda - function(formula,train,test){
model - lda(formula,train)

predict(model,test)$posterior[,pos]
}

x-Daim(OUTBREAK2 ~ COHESION.2 + COHESION.23 + ED.2 + PLAND.12 +
PLAND.2 + PLAND.25 + PLAND.26 + ALTITUDE_MEAN + SLOPE_MEAN + LSI,
model=mylda,data=landscape,labpos=1,
control=Daim.control(method=boot,number=1),returnSample=TRUE,
cutoff=0.71)

auc(x)
summary(x)


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] subset range selections

2009-07-13 Thread jim holtman

1)   X - subset(A, (t  1000) | (t  1200))
2)   X - subset(A, (t  1000)  (t  1200))

On Mon, Jul 13, 2009 at 9:47 AM, Mark Knechtmarkkne...@gmail.com wrote:
 How would I write the two selections each in a single subset command?

 1) Two non-overlapping time ranges I want to collect together - before
 10AM and after noon. Should be an OR function:

 X = subset(A, t1000)  + subset(A, t1200)

 2) One range between two defined times like after 10AM and before
 noon. Should be an AND function:

 X1 = subset(A, t1000)
 X = subset(X1, t1200)

 Thanks,
 Mark

 P.S. - The help system seems very difficult for finding this sort of
 information!

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] R in rearranging equations

2009-07-13 Thread Gabor Grothendieck

rSymPy and Ryacas can do algebraic manipulation.
See
http://rsympy.googlecode.com
http://ryacas.googlecode.com

 library(rSymPy)
Loading required package: rJava
 sympy(var('a b m r y x'))
[1] (a, b, m, r, y, x)
 sympy(solve(a*b*m*y*(1-x)-r*x, x))
[1] [-a*b*m*y/(-r - a*b*m*y)]


 library(Ryacas)
 a - Sym(a); b - Sym(b); m - Sym(m); r - Sym(r)
 x - Sym(x); y - Sym(y)
 Solve(a*b*m*y*(1-x)-r*x, x)
[1] Starting Yacas!
expression(list(x == a * b * m * y/(a * b * m * y + r)))


On Mon, Jul 13, 2009 at 9:23 AM,
stenortsten...@merchantsconnected.org.uk wrote:

 Hi,
 can anyone tell me if R can be used to rearrange very complicated equations
 in terms of one of the variables?
 I have:

 dx/dt = a*b*m*y*(1-x)-r*x

 and, having set:

 dy/dx = 0,

 need to rearrange in terms of x. The problem I have is that I don't know how
 to rearrange equations when the variables are not yet defined (I get
 messages warning me that x etc can't be found). I'm not sure if R can
 actually do this (though I know programs like Maxima can) but ideally the
 solution we get should be in R. I am also aware that this isn't difficult to
 solve by hand, but the remaining equations I have to work through have up to
 double the number of variables.

 Any help will be appreciated!
 Stenort
 --
 View this message in context: 
 http://www.nabble.com/R-in-rearranging-equations-tp24461619p24461619.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] ?max (so far...)

2009-07-13 Thread Erich Neuwirth


Belated answer:

A few remarks regarding your questions:

Your running max problem could be solved in the following way:
(which is a soution based o Duncan Murdoch's suggestion,
but a little bit more general.

foldOrbit-function(x,fun){
res-numeric(length(x))
res[1]-x[1]
for (i in 2:length(x)) res[i]-fun(res[i-1],x[i])
res
}


or more generally

applySliding-function(x,fun,winlength=length(x)){
res-numeric(length(x))
for (i in seq_along(x)) {res[i]-fun(x[(max(1,i-winlength+1)):i])}
res
}


foldOrbit(x,max)
will give you the running maxes of vector x.
For max, taking the max of the max of the sequence without the last  
element

and the last element gives the max of the whole sequence.
It also works for min, sum, prod (all these are associative).

applySliding is more general. The second argument is the function you  
want to apply in running mode.
If you do not give the winlength, it will apply the function in  
running mode an give correct result for

nonassociatve functions also.
If you give the winlength, it will only use the last winlength  
elements of the vector.

Examples:
foldOrbit(1:10,max)
applySliding(1:10,max)
applySliding(1:10,max,3)

And now, for something completely different:

You seem to want to combine Excel and R in you work.
Possibly you can make your work easier if you user RExcel,
which is an add-in allowing to use R from within Excel.

Information is available at rcom.univie.ac.at
and there is (half hour long) video demonstrating how to use
R from within Excel.




On Jul 1, 2009, at 10:26 PM, Mark Knecht wrote:

On Wed, Jul 1, 2009 at 12:54 PM, Duncan  
Murdochmurd...@stats.uwo.ca wrote:

On 01/07/2009 1:26 PM, Mark Knecht wrote:


On Wed, Jul 1, 2009 at 9:39 AM, Duncan Murdochmurd...@stats.uwo.ca
wrote:


On 01/07/2009 11:49 AM, Mark Knecht wrote:


Hi,
 I have a data.frame that is date ordered by row number - earliest
date first and most current last. I want to create a couple of new
columns that show the max and min values from other columns *so  
far* -

not for the whole data.frame.

 It seems this sort of question is really coming from my lack of
understanding about how R intends me to limit myself to portions  
of a
data.frame. I get the impression from the help files that the  
generic
way is that if I'm on the 500th row of a 1000 row data.frame and  
want

to limit the search max does to rows 1:500  I should use something
like [1:row] but it's not working inside my function. The idea  
works
outside the function, in the sense I can create tempt1[1:7] and  
the

max function returns what I expect. How do I do this with row?

 Simple example attached. hp should be 'highest p', ll should be
'lowest l'. I get an error message Error in 1:row : NA/NaN  
argument


Thanks,
Mark


SNIP


HighLow = function (MyFrame) {
  temp1 - MyFrame$p[1:row]
  MyFrame$hp - max(temp1) ## Highest p
  temp1 - MyFrame$l[1:row]
  MyFrame$ll - min(temp1) ## Lowest l

  return(MyFrame)
}


You get an error in this function because you didn't define row,  
so R
assumes you mean the function in the base package, and 1:row  
doesn't make

sense.

What you want for the highest so far is the cummax (for  
cumulative

maximum) function.  See ?cummax.

Duncan Murdoch



Duncon,
  OK, thanks. That makes sense, as long as I want the cummax from  
the

beginning of the data.frame. (Which is exactly what I asked for!)

  How would I do this in the more general case if I was looking for
the cummax of only the most recent 50 rows in my data.frame? What  
I'm
trying to get down to is that as I fill in my data.frame I need to  
be

able get a max or min or standard deviation of the previous so many
rows of data - not the whole column - and I'm just not grasping  
how to

do this. Is seems like I should be able to create a data set that's
only a portion of a column while I'm in the function and then take  
the

cummax on that, or use it as an input to a standard deviation, etc.?


What you describe might be called a running max.  The caTools  
package has

a runmax function that probably does what you want.

More generally, you can always write a loop.  They aren't  
necesssrily fast
or elegant, but they're pretty general.  For example, to calculate  
the max
of the previous 50 observations (or fewer near the start of a  
vector), you

could do

x - ... some vector ...

result - numeric(length(x))
for (i in seq_along(x)) {
 result[i] - max( x[ max(1, i-49):i ])
}

Duncan Murdoch



Thanks for the pointer. I'll check it out.

Today I've managed to get pretty much all of my Excel spreadsheet
built in R except for some of the charts. It took me a week and a half
in Excel. This is my 3rd full day with R. Charts are next.

I appreciate your help and the help I've gotten from others. Thanks  
so much.


cheers,
Mark

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE

Re: [R] Help With Fleiss Kappa

2009-07-13 Thread William Revelle


At 6:48 PM +1200 7/13/09, mehdi ebrahim wrote:

Hi All,

I am using fleiss kappa for inter rater agreement.  Are there any know
issues with Fleiss kappa calculation in R?  Even when I supply mock data
with total agreement among the raters I do not get a kappa value of 1.
instead I am getting negative values.

I am using the irr package version 0.70

Any help is much appreciated.


Mehdi,
  Are you by any chance giving the function the agreement matrix 
rather than the raw data?  The  kappam.fliess function seems to want 
the ratings rather than the agreement matrix.


Bill


--
William Revelle http://personality-project.org/revelle.html
Professor   http://personality-project.org/personality.html
Department of Psychology http://www.wcas.northwestern.edu/psych/
Northwestern University http://www.northwestern.edu/
Attend  ISSID/ARP:2009   http://issid.org/issid.2009/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] zero cells in one variable in logistic regression

2009-07-13 Thread Vito Muggeo (UniPa)


dear anna,
if you are not interested in point estimate and SE of the parameter of 
the aforementioned categorical variable, I believe the conventional 
glm(..,family=binomial) is correct. In particular, the returned deviance 
is reliable and also it is the relevant likelihood ratio test..


hope this helps,
vito


David Winsemius ha scritto:


On Jul 13, 2009, at 5:37 AM, anna.bucharova wrote:



Dear all.
I am sort of beginner with R. I do logistic regression with binomial
response variable and several continuous and categorical variables. In 
one

categorical variable, zero cell occures (2x2 table looks like
7 - 0
23 - 25
This leads to overestimating of odds ratio and inflated confidence 
interval
for odds for given variable. The variable is significant in univariate 
test.
I do not necessarilly need odd ratio, but I need the explained 
deviance by

this variable and I really want to keep this variable in the model. It
probably matters for explained deviance. How to treat this problem?
Thanks for help, Anna Bucharova
--


You might consider glmrob in package:robustbase. See 
http://www.jstatsoft.org/v10/i04


David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.



--

Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 6626240
fax: 091 485726/485612
http://dssm.unipa.it/vmuggeo

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Heckman Selection MOdel Help in R

2009-07-13 Thread Arne Henningsen

On Mon, Jul 13, 2009 at 11:18 AM, Pathak,
Sauravs.patha...@imperial.ac.uk wrote:
 Dear Arne
 I have gone through the paper and I have tried it at my end, I would really 
 appreciate if you could address the following:

 1. Based upon your suggestion I used the following:

 regmod2 - selection(s ~ age + gender + gemedu + gemhinc + es_gdppc +
    imf_pop + estbbo_m, ln_oy5_1 ~ age+ gender+fearfail+gemedu, adpopdata, 
 method = 2step)
 On trying the above( notice that I have changed heckit to selection in 
 the above command, i get the following error message

 Error in coef.probit(result$probit) :
  could not find function coef.maxLik

That's weird. Which versions of R, sampleSelection, and maxLik do you use?

 Before trying the above I tried the following:

 2. When I tried to do the Heckman selection model in stages , the first run 
 was successful, I mean, using the following:

 myProbit- glm(s ~ age + gender + gemedu + gemhinc + es_gdppc +
 +     imf_pop + estbbo_m, family = binomial(link = probit))
 summary(myProbit)

 I am successful upto this point, but

 3. When I try calculating the IMR using the following:
 adpopdata$IMR-invMillsRatio(myProbit)$IMR1

 I get the error below
 Error in `$-.data.frame`(`*tmp*`, IMR, value = c(2.50039945424535,  :
  replacement has 257358 rows, data has 343251

I guess that you have some NAs in the data so that you have the IMRs
not for all observations but only for the observations witout NAs.

R myIMRs - invMillsRatio(myProbit)$IMR1
should work.

 Is there a code to calculate IMR by hand??

Yes, inside invMillsRatio()
However, why do you want to do this?

 what I see is that the number of rows of IMR calculated and the number
 of rows in the actual data set do not match (may be some missing
 value issues, I am not sure, if it is, how to fix it?) and hence IMR could
 not be added to my original data set, how do I fix this and then proceed
 to get correct IMR to use in my outcome equation  (the OLS stage)

 This is really taking a lot of time, I am working on it for weeks, can
 you please help me kindly, If you wish I can send you the data set as well

Please try to fix it yourself.

Arne


 -Original Message-
 From: Arne Henningsen [mailto:arne.henning...@googlemail.com]
 Sent: 13 July 2009 00:56
 To: Pathak, Saurav; r-help@r-project.org; otoo...@ut.ee
 Subject: Re: Heckman Selection MOdel Help in R

 Hi Saurav!

 On Sun, Jul 12, 2009 at 6:06 PM, Pathak,
 Sauravs.patha...@imperial.ac.uk wrote:
 I am new to R, I have to do a 2 step Heckman model, my selection equation is
 below which I was successful in running but I am unable to proceed further,



 I have so far used the following command

 glm(formula = s ~ age + gender + gemedu + gemhinc + es_gdppc +
     imf_pop + estbbo_m, family = binomial(link = probit))

 My question is
 1. How do i discard the non significant selection variables (one out of the
 seven variables above is non-significant) and calculate the Inverse Mills
 Ratio of the significant variables

 2. I need the inverse mills ratio from the above to run the outcome equation
 model using OLS with the Inverse mills ratio calculated on the basis of the
 above probit as the control in my outcome equation,  hence I need to get the
 IMR (Is there another direct way?)

 3. How can this be done in R using my concept or otherwise does there exist
 another way of doing what I wish to achieve



 On trying

 regmod - heckit(s ~ age + gender + gemedu + gemhinc + es_gdppc +

     imf_pop + estbbo_m, ln_oy5_1 ~ age+ gender+fearfail+gemedu,
 adpopdata,method=2step)



 I get

 Error: could not find function heckit



 Error: could not find function invMillsRatio



 Am I missing out something, do i have to install something apart from R
 also, so far I have used



 install.packages( sampleSelection, repos=http://R-Forge.R-project.org; )

 install.packages(Rcmdr, dependencies=TRUE)



 Even then I am unable to run heckit, please help

 You have to install (only once) and *load* the package before you can use it:
 R library( sampleSelection )

 I suggest that you do NOT use function heckit but function
 selection, because the latter allows you to estimate the model both
 by the 2-step and the 1-step (ML) method (depending on argument
 method).

 Our paper about the sampleSelection package published in the JSS could
 be also helpful for you:
 http://www.jstatsoft.org/v27/i07/

 Arne

 --
 Arne Henningsen
 http://www.arne-henningsen.name




-- 
Arne Henningsen
http://www.arne-henningsen.name

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] plm Issues

2009-07-13 Thread Damien Moore

Duh! Thanks and good advice. I was using 2.7.2 because it was, until
recently, the latest version working with RPy (http://rpy.sourceforge.net/).
Also didn't realize plm was still actively developed.

Interesting that since plm now correctly handles diff and lag operations, it
actually breaks with the behavior of lm:

 a=ts(c(1,2,4))
 lm(a~diff(a))
Error in model.frame.default(formula = a ~ diff(a), drop.unused.levels =
TRUE) :
  variable lengths differ (found for 'diff(a)')

To regress a on its difference, one needs the more laborious:
 a=ts(c(1,2,4))
 adata=as.data.frame(cbind(a,diff(a)))
 colnames(adata)=c(a,diffa)
 lm(a~diffa,data=adata)

Call:
lm(formula = a ~ diffa, data = adata)

Coefficients:
(Intercept)diffa
  02

From the R help
Fitting Linear ModelsUsing time series

Considerable care is needed when using lm with time series.

Unless na.action = NULL, the time series attributes are stripped from the
variables before the regression is done. (This is necessary as omitting NAs
would invalidate the time series attributes, and if NAs are omitted in the
middle of the series the result would no longer be a regular time series.)

Even if the time series attributes are retained, they are not used to line
up series, so that the time shift of a lagged or differenced regressor would
be ignored. It is good practice to prepare a data argument by
ts.intersectts.union.html(...,
dframe = TRUE), then apply a suitable na.action to that data frame and call
lm with na.action = NULL so that residuals and fitted values are time
series.


On Sat, Jul 11, 2009 at 10:53 PM, milton ruser milton.ru...@gmail.comwrote:

 The first think one need to do when has a so old version, is update it :-)
 After, if the problem remain, try get help with the colleagues.

 best

 milton

 On Thu, Jul 9, 2009 at 10:58 AM, Damien Moore damienlmo...@gmail.comwrote:

 Hi List

 I'm having difficulty understanding how plm should work with dynamic
 formulas. See the commands and output below on a standard data set. Notice
 that the first summary(plm(...)) call returns the same result as the
 second
 (it shouldn't if it actually uses the lagged variable requested). The
 third
 call results in error (trying to use diff'ed variable in regression)

 Other info: I'm running R 2.7.2 on WinXP

 cheers



 *data(Gasoline,package=Ecdat)
 Gasoline_plm-plm.data(Gasoline,c(country,year))
 pdim(Gasoline_plm)
 **Balanced Panel: n=18, T=19, N=342
 *
 *summary(plm(lgaspcar~lincomep,data=Gasoline_plm**))
 **Oneway (individual) effect Within Model

 Call:
 plm(formula = lgaspcar ~ lincomep, data = Gasoline_plm)

 Balanced Panel: n=18, T=19, N=342

 Residuals :
Min.  1st Qu.   Median  3rd Qu. Max.
 -0.40100 -0.08410 -0.00858  0.08770  0.73400

 Coefficients :
 Estimate Std. Error t-value  Pr(|t|)
 lincomep -0.761830.03535 -21.551  2.2e-16 ***
 ---
 Signif. codes:  0 *** 0.001 ** 0.01 * 0.05 . 0.1   1

 Total Sum of Squares: 17.061
 Residual Sum of Squares: 6.9981
 Multiple R-Squared: 0.58981
 F-statistic: 464.442 on 323 and 1 DF, p-value: 0.036981

 ** summary(plm(lgaspcar~lag(lincomep),data=Gasoline_plm))
 **Oneway (individual) effect Within Model

 Call:
 plm(formula = lgaspcar ~ lag(lincomep), data = Gasoline_plm)

 Balanced Panel: n=18, T=19, N=342

 Residuals :
Min.  1st Qu.   Median  3rd Qu. Max.
 -0.40100 -0.08410 -0.00858  0.08770  0.73400

 Coefficients :
  Estimate Std. Error t-value  Pr(|t|)
 lag(lincomep) -0.761830.03535 -21.551  2.2e-16 ***
 ---
 Signif. codes:  0 *** 0.001 ** 0.01 * 0.05 . 0.1   1

 Total Sum of Squares: 17.061
 Residual Sum of Squares: 6.9981
 Multiple R-Squared: 0.58981
 F-statistic: 464.442 on 323 and 1 DF, p-value: 0.036981

 *
 *summary(plm(lgaspcar~diff(lincomep),data=Gasoline_plm))*
 *Error in model.frame.default(formula = lgaspcar ~ diff(lincomep), data =
 mydata,  :
  variable lengths differ (found for 'diff(lincomep)')
 *

[[alternative HTML version deleted]]


 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.htmlhttp://www.r-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] variance explained by each predictor in GAM

2009-07-13 Thread Kayce Anderson

Many thanks for the advice David. I would really like to figure out, though,
how to get the contribution of each factor to the Rsq - something like a
Beta coefficient for GAM.   Ideas?
KC

On Sun, Jul 12, 2009 at 5:41 PM, David Winsemius dwinsem...@comcast.netwrote:


 On Jul 12, 2009, at 5:06 PM, Kayce Anderson wrote:

  Hi,
 I am using mgcv:gam and have developed a model with 5 smoothed predictors
 and one factor.

 gam1 - gam(log.sp~ s(Spr.precip,bs=ts)  + s(Win.precip,bs=ts) + s(
 Spr.Tmin,bs=ts)  + s(P.sum.Tmin,bs=ts) + s( Win.Tmax,bs=ts)
 +factor(site),data=dat3)


 The total deviance explained = 70.4%.


 I would like to extract the variance explained by each predictor.  Is
 there
 a straightforward way to do this?  I have tried dropping a term and
 recalculating the model, but the edf's change if there is any correlation
 among variables, thereby making all of the relationships different.  I
 haven't yet figured out how to fix the smoothing terms- I get syntax error
 messages.  Among other variations, I tried, for example,
 log.sp~s(Spr.precip, sp=3.9, fx=TRUE) +...



 ?anova.gam

 Obviously I cannot test this with your dat3. You get an F-statistic for
 each s() term by default and you are referred to saummary.gam for further
 explanation.

 David Winsemius, MD
 Heritage Laboratories
 West Hartford, CT



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] weighted cox ph model

2009-07-13 Thread Bessy


Dear all,

I am struggling with the weighted Efron approximation. I really need some
help.

I am confused by the explanation regarding applying weights to Efron
approximation on the paper of “A Package for Survival Analysis in S” (Terry
M. Therneau, Feb 1999).  In the section 3.3.7 Weighted Cox models, it states
that, for a simple example with 5 subjects in time order and 1 and 2 died at
the same time, the second term of the weighted partial likelihood is either
w1r1/ ( w1r1 + w3r3+ w4r4+ w5r5)
or 
w1r1/ (w2r2+ w3r3+ w4r4+ w5r5)

The author suggests that to compute the Efron approximation one can simply
separately replace the numerator with 0.5 (w1r1 + w2r2) and the denominator
with 0.5w1r1 + 0.5w2r2+ w3r3+ w4r4+ w5r5. 

Does this suggestion refer to the second term (stated above) of the
likelihood function only ?

If so should the first term of the likelihood function should still be 
w2r2/ (w1r1 + w2r2+ w3r3+ w4r4+ w5r5)
or 
w1r1/ (w1r1 + w2r2+ w3r3+ w4r4+ w5r5)?

Then to compute the Efron approximation, should the numerator of the this
likelihood function should be 
0.5 (w1r1 + w2r2)* 0.5 (w1r1 + w2r2), 
i.e.
the likelihood function should be
[0.5 (w1r1 + w2r2)/ (w1r1 + w2r2+ w3r3+ w4r4+ w5r5) ]* [0.5 (w1r1 + w2r2)/(
0.5w1r1 + 0.5w2r2+ w3r3+ w4r4+ w5r5)]?

Since we add the probability to the numerator of the second term, we should
add the probability to the numerator of the first term as well. If 1 and 2
each have the probability of 0.5 for the deaths in the second term they
should also have the probability of 0.5 for the deaths in the first term.
That’s my thinking. If it is wrong, please correct it for me. I will really
appreciate it. 

By the way, I tried another weighted Efron approximation, which can yield
the same result with R if and only if the weights for tied data are the
same.  Otherwise it yields the different results but close to the R’s.

The formula has been attached with this message already.
Could someone give me some hints regarding my formula, please? 

Any suggestions will be sincerely appreciated.

Bessy
http://www.nabble.com/file/p24462598/Weighted%2BEfron%2BApproximation_Nabble_R.doc
Weighted+Efron+Approximation_Nabble_R.doc 
-- 
View this message in context: 
http://www.nabble.com/weighted-cox-ph-model-tp24462598p24462598.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] subset range selections

Thanks Jim.

How does one search the help system for info on simple logic like this?

On Mon, Jul 13, 2009 at 6:59 AM, jim holtmanjholt...@gmail.com wrote:
 1)   X - subset(A, (t  1000) | (t  1200))
 2)   X - subset(A, (t  1000)  (t  1200))

 On Mon, Jul 13, 2009 at 9:47 AM, Mark Knechtmarkkne...@gmail.com wrote:
 How would I write the two selections each in a single subset command?

 1) Two non-overlapping time ranges I want to collect together - before
 10AM and after noon. Should be an OR function:

 X = subset(A, t1000)  + subset(A, t1200)

 2) One range between two defined times like after 10AM and before
 noon. Should be an AND function:

 X1 = subset(A, t1000)
 X = subset(X1, t1200)

 Thanks,
 Mark

 P.S. - The help system seems very difficult for finding this sort of
 information!

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




 --
 Jim Holtman
 Cincinnati, OH
 +1 513 646 9390

 What is the problem that you are trying to solve?


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] plm Issues

2009-07-13 Thread Gabor Grothendieck

The dyn and dynlm packages can handle time series in lm and glm.
(dyn can also handle many additional regression functions as well)
In the case of dyn just write dyn$lm instead of lm like this:

 library(dyn)
 a - ts(c(1, 2, 4))
 dyn$lm(a ~ diff(a))

Call:
lm(formula = dyn(a ~ diff(a)))

Coefficients:
(Intercept)  diff(a)
  02


On Mon, Jul 13, 2009 at 10:17 AM, Damien Mooredamienlmo...@gmail.com wrote:
 Duh! Thanks and good advice. I was using 2.7.2 because it was, until
 recently, the latest version working with RPy (http://rpy.sourceforge.net/).
 Also didn't realize plm was still actively developed.

 Interesting that since plm now correctly handles diff and lag operations, it
 actually breaks with the behavior of lm:

 a=ts(c(1,2,4))
 lm(a~diff(a))
 Error in model.frame.default(formula = a ~ diff(a), drop.unused.levels =
 TRUE) :
  variable lengths differ (found for 'diff(a)')

 To regress a on its difference, one needs the more laborious:
 a=ts(c(1,2,4))
 adata=as.data.frame(cbind(a,diff(a)))
 colnames(adata)=c(a,diffa)
 lm(a~diffa,data=adata)

 Call:
 lm(formula = a ~ diffa, data = adata)

 Coefficients:
 (Intercept)        diffa
          0            2

 From the R help
 Fitting Linear ModelsUsing time series

 Considerable care is needed when using lm with time series.

 Unless na.action = NULL, the time series attributes are stripped from the
 variables before the regression is done. (This is necessary as omitting NAs
 would invalidate the time series attributes, and if NAs are omitted in the
 middle of the series the result would no longer be a regular time series.)

 Even if the time series attributes are retained, they are not used to line
 up series, so that the time shift of a lagged or differenced regressor would
 be ignored. It is good practice to prepare a data argument by
 ts.intersectts.union.html(...,
 dframe = TRUE), then apply a suitable na.action to that data frame and call
 lm with na.action = NULL so that residuals and fitted values are time
 series.


 On Sat, Jul 11, 2009 at 10:53 PM, milton ruser milton.ru...@gmail.comwrote:

 The first think one need to do when has a so old version, is update it :-)
 After, if the problem remain, try get help with the colleagues.

 best

 milton

 On Thu, Jul 9, 2009 at 10:58 AM, Damien Moore damienlmo...@gmail.comwrote:

 Hi List

 I'm having difficulty understanding how plm should work with dynamic
 formulas. See the commands and output below on a standard data set. Notice
 that the first summary(plm(...)) call returns the same result as the
 second
 (it shouldn't if it actually uses the lagged variable requested). The
 third
 call results in error (trying to use diff'ed variable in regression)

 Other info: I'm running R 2.7.2 on WinXP

 cheers



 *data(Gasoline,package=Ecdat)
 Gasoline_plm-plm.data(Gasoline,c(country,year))
 pdim(Gasoline_plm)
 **Balanced Panel: n=18, T=19, N=342
 *
 *summary(plm(lgaspcar~lincomep,data=Gasoline_plm**))
 **Oneway (individual) effect Within Model

 Call:
 plm(formula = lgaspcar ~ lincomep, data = Gasoline_plm)

 Balanced Panel: n=18, T=19, N=342

 Residuals :
    Min.  1st Qu.   Median  3rd Qu.     Max.
 -0.40100 -0.08410 -0.00858  0.08770  0.73400

 Coefficients :
         Estimate Std. Error t-value  Pr(|t|)
 lincomep -0.76183    0.03535 -21.551  2.2e-16 ***
 ---
 Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

 Total Sum of Squares: 17.061
 Residual Sum of Squares: 6.9981
 Multiple R-Squared: 0.58981
 F-statistic: 464.442 on 323 and 1 DF, p-value: 0.036981

 ** summary(plm(lgaspcar~lag(lincomep),data=Gasoline_plm))
 **Oneway (individual) effect Within Model

 Call:
 plm(formula = lgaspcar ~ lag(lincomep), data = Gasoline_plm)

 Balanced Panel: n=18, T=19, N=342

 Residuals :
    Min.  1st Qu.   Median  3rd Qu.     Max.
 -0.40100 -0.08410 -0.00858  0.08770  0.73400

 Coefficients :
              Estimate Std. Error t-value  Pr(|t|)
 lag(lincomep) -0.76183    0.03535 -21.551  2.2e-16 ***
 ---
 Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

 Total Sum of Squares: 17.061
 Residual Sum of Squares: 6.9981
 Multiple R-Squared: 0.58981
 F-statistic: 464.442 on 323 and 1 DF, p-value: 0.036981

 *
 *summary(plm(lgaspcar~diff(lincomep),data=Gasoline_plm))*
 *Error in model.frame.default(formula = lgaspcar ~ diff(lincomep), data =
 mydata,  :
  variable lengths differ (found for 'diff(lincomep)')
 *

        [[alternative HTML version deleted]]


 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.htmlhttp://www.r-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




        [[alternative HTML version deleted]]


 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the

Re: [R] subset range selections

2009-07-13 Thread jim holtman

You can find the operators with:

?|

This will show you logical operators.  You can do:

?+

to get the others.

On Mon, Jul 13, 2009 at 10:21 AM, Mark Knechtmarkkne...@gmail.com wrote:
 Thanks Jim.

 How does one search the help system for info on simple logic like this?

 On Mon, Jul 13, 2009 at 6:59 AM, jim holtmanjholt...@gmail.com wrote:
 1)   X - subset(A, (t  1000) | (t  1200))
 2)   X - subset(A, (t  1000)  (t  1200))

 On Mon, Jul 13, 2009 at 9:47 AM, Mark Knechtmarkkne...@gmail.com wrote:
 How would I write the two selections each in a single subset command?

 1) Two non-overlapping time ranges I want to collect together - before
 10AM and after noon. Should be an OR function:

 X = subset(A, t1000)  + subset(A, t1200)

 2) One range between two defined times like after 10AM and before
 noon. Should be an AND function:

 X1 = subset(A, t1000)
 X = subset(X1, t1200)

 Thanks,
 Mark

 P.S. - The help system seems very difficult for finding this sort of
 information!

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




 --
 Jim Holtman
 Cincinnati, OH
 +1 513 646 9390

 What is the problem that you are trying to solve?





-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Splitting dataset for Tuning Parameter with Cross Validation

2009-07-13 Thread Max Kuhn

Here is what the train() function in the caret package does by default
(you can change this behavior; see below).

Using the entire data set, estimate the RBF parameter using the
sigest() function in the kernlab package (which, if I recall correctly
involves the median of a sample of kernel matrix values).

Using this fixed value, the cost function is varied over a common set
of held-out samples. More specifically, every value of the cost
parameter is evaluated on the same exact folds. I've been able to
achieve pretty good performance this way in almost every case where
I've done the comparison,

Based on these performance values, you can select the cost function
based on the best performance. There are also ways of selecting the
simplest model that is within the uncertainty of the numerically
optimal model (that is done using the selectionFunction argument of
trainControl).

I should also note that you can tune across any grid of cost and sigma
(this is done via the tuneGrid argument of train()).

Max

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Issues with file.info?

2009-07-13 Thread Jason Rupert


Yes.  You are right that is the trick:)

Here is what I'm seeing:
input_path-c(C:/WINDOWS/system32)
file.info(list.files(input_path))


size isdir mode mtime ctime atime  exe
$winnt$.inf   NANA NA  NA  NA  NA NA
1025   NANA NA  NA  NA  NA NA
...
xsi.jar  NA NA NA  NA NA NA NA
xsi.zip  NA NA NA  NA NA NA NA
zipfldr.dll NA NA NA NA NA NA NA

input_path-c(C:/WINDOWS/system32/winmine.exe)
file.info(input_path)

  size isdir mode   mtime   
ctime   atime   exe
C:/WINDOWS/system32/winmine.exe 119808 FALSE  777 2002-12-31 07:00:00 
2008-08-15 14:28:37 2009-07-13 09:39:32 win32

Not sure why, file.info is not working for the array of input 
files/folders/etc. but successfully works when an individual file in input.  

Thanks again for your insights and feedback. 



--- On Thu, 7/9/09, Rolf Turner r.tur...@auckland.ac.nz wrote:

 From: Rolf Turner r.tur...@auckland.ac.nz
 Subject: Re: [R] Issues with file.info?
 To: Jason Rupert jasonkrup...@yahoo.com
 Cc: R-help@r-project.org R-help@r-project.org
 Date: Thursday, July 9, 2009, 3:13 PM
 
 On 10/07/2009, at 6:02 AM, Jason Rupert wrote:
 
  
  Are there any tricks associated with file.info?
  
  I just tried it on a directory folder and it returned
 NA for all fields for all files.  I tried it on a
 different folder with different files and it still returned
 NA.
  
  I tried it on a specific file and it returned all the
 proper info correctly.
  
  Just wondering if there are any tricks I've
 overlooked.
 
 Yes.  You've overlooked the trick of telling us about
 the specifics of
 your operating system, version of R, etc., and of showing
 exactly what
 commands you ``tried'', i.e. of reading the Posting Guide.
 
     cheers,
 
         Rolf Turner
 
 
 
 ##
 Attention:This e-mail message is privileged and
 confidential. If you are not theintended recipient please
 delete the message and notify the sender.Any views or
 opinions presented are solely those of the author.
 
 This e-mail has been scanned and cleared by
 MailMarshalwww.marshalsoftware.com
 ##
 




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Nonlinear Least Squares nls() programming help

2009-07-13 Thread Dieter Menne




MathZero wrote:
 
 Hi, I am trying to use the nls() function to closely approximate a vector
 of values, colC and I'm running into trouble.  I am not sure how if I am
 asking the program to do what I think its doing, because the same
 minimization in Excel's Solver does not run into problems.  
 
 

Fitting a function is an approximation, trying to find a minimum. Think of
frozen mountain lake surrounded by mountains. Excel's Solver will report the
highest tip of the snowflake on the lake, if it finds it. nls will find out
that the lake is essentially flat compare to the surrounding and tell you
this fact in unkind word.

In your case, I assume there is an (additional?) problem that the solution
might be non-identifyable. It could required additional constraints, for
example that one coefficient is always larger than the other.

Dieter


-- 
View this message in context: 
http://www.nabble.com/Nonlinear-Least-Squares-nls%28%29-programming-help-tp24445472p24463097.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] adjusting survival using coxph

2009-07-13 Thread Christopher Jones

I have what I *think* should be a simple problem in R, and hope  
someone might be able to help me.

I'm working with cancer survival data, and would like to calculate  
adjusted survival figures based on the age of the patient and the  
tumour classification. A friendly statistician told me I should use  
Cox proportional hazards to do this, and I've made some progress with  
using the coxph function. However, there doesn't seem to be a simple  
way to get adjusted survival figures, and I'm still a bit unsure  
whether I'm using the function correctly.

I've found plenty of examples of how to plot survival curves for a  
single catagorical variable, like tumour stage. However, its a bit  
less obvious how to properly combine multiple variables (like stage  
and grade) - examples I've found aren't very helpful. For instance,  
can someone explain what these two function calls are doing differently?

   cox.model1 - coxph(Surv(time,outcome)~stage+grade,method=breslow)
   cox.model2 - coxph(Surv(time,outcome)~stage+grade 
+stage:grade,method=breslow)

Also, am I right to think that this is how I would plot the survival  
curves for stage in a 50 year old patient?

   cox.model3 - coxph(Surv(time,outcome)~stage+I 
(age-50),method=breslow)
   plot(survfit(cox.model3,newdata=data.frame(stage2=c 
(III,IV),age=c(50,50

Finally, assuming I manage to correctly call the coxph function in  
the first place (!), what is the sensible way to adjust the survival  
for each patient? My understanding of what I'm trying/hoping to do  
has led me down the following dark alley;

   1) pick a sensible centre point (median age, modal stage/grade) as  
my 'standard' case
   2) obtain the survival probabilities for this 'standard' case from  
the Cox model
   3) for each patient find the probability of their observed  
survival, for their specific age/stage/grade
   4) the adjusted survival is the survival time from the 'standard'  
case that matches the observed survival probability

This seems long-winded, and for all I know might be completely the  
wrong way to do it.  hopefully someone will be able to offer some  
friendly advice!

Apologies for the long-ish post, and thanks in advance for any help.


Chris Jones.


---
Gynaecological Cancer Research Laboratories,
UCL EGA Institute for Women's Health,
University College London,
Paul O'Gorman Building,
72 Huntley Street,
London
WC1E 6DD
United Kingdom

Telephone; 020 3108 2007
Fax; 020 3108 2010


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] subset range selections


On 7/13/2009 10:21 AM, Mark Knecht wrote:

Thanks Jim.

How does one search the help system for info on simple logic like this?


Look in the manuals (the pdf ones), rather than the man pages.  For 
example, expressions like the ones below are in the Introduction to R, 
section 2.4, Logical Vectors.  (The table of contents contains Logical 
vectors, within Simple manipulations.)


Duncan Murdoch



On Mon, Jul 13, 2009 at 6:59 AM, jim holtmanjholt...@gmail.com wrote:

1)   X - subset(A, (t  1000) | (t  1200))
2)   X - subset(A, (t  1000)  (t  1200))

On Mon, Jul 13, 2009 at 9:47 AM, Mark Knechtmarkkne...@gmail.com wrote:

How would I write the two selections each in a single subset command?

1) Two non-overlapping time ranges I want to collect together - before
10AM and after noon. Should be an OR function:

X = subset(A, t1000)  + subset(A, t1200)

2) One range between two defined times like after 10AM and before
noon. Should be an AND function:

X1 = subset(A, t1000)
X = subset(X1, t1200)

Thanks,
Mark

P.S. - The help system seems very difficult for finding this sort of
information!

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.





--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Issues with file.info?


On 7/13/2009 10:42 AM, Jason Rupert wrote:

Yes.  You are right that is the trick:)

Here is what I'm seeing:
input_path-c(C:/WINDOWS/system32)
file.info(list.files(input_path))


size isdir mode mtime ctime atime  exe
$winnt$.inf   NANA NA  NA  NA  NA NA
1025   NANA NA  NA  NA  NA NA
...
xsi.jar  NA NA NA  NA NA NA NA
xsi.zip  NA NA NA  NA NA NA NA
zipfldr.dll NA NA NA NA NA NA NA

input_path-c(C:/WINDOWS/system32/winmine.exe)
file.info(input_path)

  size isdir mode   mtime   
ctime   atime   exe
C:/WINDOWS/system32/winmine.exe 119808 FALSE  777 2002-12-31 07:00:00 
2008-08-15 14:28:37 2009-07-13 09:39:32 win32

Not sure why, file.info is not working for the array of input files/folders/etc. but successfully works when an individual file in input.  


Look at what list.files() returns:  there is no path, just the filename. 
 If you want to use file.info, you need the full path, so you'd use


file.info(list.files(input_path,full=TRUE))

(That takes a long time, by the way:  there are a lot of files there, 
and they all need to be opened for inspection!)


Duncan Murdoch


Thanks again for your insights and feedback. 




--- On Thu, 7/9/09, Rolf Turner r.tur...@auckland.ac.nz wrote:


From: Rolf Turner r.tur...@auckland.ac.nz
Subject: Re: [R] Issues with file.info?
To: Jason Rupert jasonkrup...@yahoo.com
Cc: R-help@r-project.org R-help@r-project.org
Date: Thursday, July 9, 2009, 3:13 PM

On 10/07/2009, at 6:02 AM, Jason Rupert wrote:

 
 Are there any tricks associated with file.info?
 
 I just tried it on a directory folder and it returned

NA for all fields for all files.  I tried it on a
different folder with different files and it still returned
NA.
 
 I tried it on a specific file and it returned all the

proper info correctly.
 
 Just wondering if there are any tricks I've

overlooked.

Yes.  You've overlooked the trick of telling us about
the specifics of
your operating system, version of R, etc., and of showing
exactly what
commands you ``tried'', i.e. of reading the Posting Guide.

cheers,

Rolf Turner



##
Attention:This e-mail message is privileged and
confidential. If you are not theintended recipient please
delete the message and notify the sender.Any views or
opinions presented are solely those of the author.

This e-mail has been scanned and cleared by
MailMarshalwww.marshalsoftware.com
##






__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] math expressions in graphs

2009-07-13 Thread Rafael Moral

Dear useRs,

I want to put a math expression in the ylab in my plot which should be a Delta 
and a 'y' with a trace and a hat above it and a 't' as subscription.
How could I manage to do it?

Thanks in advance,
Regards,
Rafael.


  

[[elided Yahoo spam]]

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] variance explained by each predictor in GAM

2009-07-13 Thread Simon Wood

You can get some idea by doing something like the following, which compares 
the r^2 for models b and b2, i.e. with and without s(x2).  It keeps the 
smoothing parameters fixed for the comparison. (s(x,fx=TRUE) removes 
penalization altogether btw, which is not what was wanted). 

dat - gamSim(1,n=400,dist=normal,scale=2)
b-gam(y~s(x0)+s(x1)+s(x2)+s(x3),data=dat)
b2-gam(y~s(x0)+s(x1)+s(x3),sp=b$sp[-3],data=dat)
summary(b2)$dev.expl
summary(b)$dev.expl


On Monday 13 July 2009 15:09, Kayce Anderson wrote:
 Many thanks for the advice David. I would really like to figure out,
 though, how to get the contribution of each factor to the Rsq - something
 like a Beta coefficient for GAM.   Ideas?
 KC

 On Sun, Jul 12, 2009 at 5:41 PM, David Winsemius 
dwinsem...@comcast.netwrote:
  On Jul 12, 2009, at 5:06 PM, Kayce Anderson wrote:
 
   Hi,
 
  I am using mgcv:gam and have developed a model with 5 smoothed
  predictors and one factor.
 
  gam1 - gam(log.sp~ s(Spr.precip,bs=ts)  + s(Win.precip,bs=ts) + s(
  Spr.Tmin,bs=ts)  + s(P.sum.Tmin,bs=ts) + s( Win.Tmax,bs=ts)
  +factor(site),data=dat3)
 
 
  The total deviance explained = 70.4%.
 
 
  I would like to extract the variance explained by each predictor.  Is
  there
  a straightforward way to do this?  I have tried dropping a term and
  recalculating the model, but the edf's change if there is any
  correlation among variables, thereby making all of the relationships
  different.  I haven't yet figured out how to fix the smoothing terms- I
  get syntax error messages.  Among other variations, I tried, for
  example,
  log.sp~s(Spr.precip, sp=3.9, fx=TRUE) +...
 
  ?anova.gam
 
  Obviously I cannot test this with your dat3. You get an F-statistic for
  each s() term by default and you are referred to saummary.gam for further
  explanation.
 
  David Winsemius, MD
  Heritage Laboratories
  West Hartford, CT

   [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html and provide commented, minimal,
 self-contained, reproducible code.

-- 
 Simon Wood, Mathematical Sciences, University of Bath, Bath, BA2 7AY UK
 +44 1225 386603  www.maths.bath.ac.uk/~sw283

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] variance explained by each predictor in GAM

2009-07-13 Thread David Winsemius

It appears you are conflating beta coefficients (individual covariate  
effect measures) with overall model fit measures. Beta coefficients  
are not directly comparable to R-squared measures in ordinary least  
squares analyses, so why would they be so in gam models?


I cannot tell whether you actually looked at anova.gam or consulted  
Wood's book (which is really the better place to go for advice  
regarding mgcv function questions. You were offered an apportioned F- 
statistic for each covariate smooth. How was that not satisfactory?  
What is your goal in this search?


--
DW

--
On Jul 13, 2009, at 10:09 AM, Kayce Anderson wrote:

Many thanks for the advice David. I would really like to figure out,  
though, how to get the contribution of each factor to the Rsq -  
something like a Beta coefficient for GAM.   Ideas?


KC

On Sun, Jul 12, 2009 at 5:41 PM, David Winsemius dwinsem...@comcast.net 
 wrote:


On Jul 12, 2009, at 5:06 PM, Kayce Anderson wrote:

Hi,
I am using mgcv:gam and have developed a model with 5 smoothed  
predictors

and one factor.

gam1 - gam(log.sp~ s(Spr.precip,bs=ts)  + s(Win.precip,bs=ts) +  
s(

Spr.Tmin,bs=ts)  + s(P.sum.Tmin,bs=ts) + s( Win.Tmax,bs=ts)
+factor(site),data=dat3)


The total deviance explained = 70.4%.


I would like to extract the variance explained by each predictor.


You may need to define your terms and there may not be an analog

Is there
a straightforward way to do this?  I have tried dropping a term and
recalculating the model, but the edf's change if there is any  
correlation

among variables, thereby making all of the relationships different.  I
haven't yet figured out how to fix the smoothing terms- I get syntax  
error

messages.  Among other variations, I tried, for example,
log.sp~s(Spr.precip, sp=3.9, fx=TRUE) +...



?anova.gam

Obviously I cannot test this with your dat3. You get an F-statistic  
for each s() term by default and you are referred to saummary.gam  
for further explanation.


David Winsemius, MD
Heritage Laboratories
West Hartford, CT




David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] object .trPaths not found

2009-07-13 Thread Knut Krueger


Uwe Ligges schrieb:




I have to admit that I have no idea what we are talking about here 
(yes, I tend to forget many things these days) - and you have not 
cited the original message, unfortunately (nor have you specifies R 
versions, Tinn-R versions and both OS versions, but just one) ...


Best wishes,
Uwe

Original question
rkevinbur...@charter.net wrote:
I am running an R script with Tinn-R (2.2.0.1) and I get the error 
message


Error in source(.trPaths[4], echo = TRUE, max.deparse.length = 150) : 
  object .trPaths not found


Any solutions?

System

R 2.8.1
Tinn-R 2.2.02
windows Xp professional Servicepack 3

This combination works on my Pc without problems
And the code is working on the other system with Tinn-R problems
with the R-Editor without problems
so it seems to be a Tinn_R problem

but maybe anybody has a solution for that.

Regards Knut

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] math expressions in graphs

2009-07-13 Thread Jorge Ivan Velez

Dear Rafael,
Try the following which seems to be close to what you asked for:

plot(1, ylab = expression( Delta ~ hat(y) [t] ) )


HTH,

Jorge



On Mon, Jul 13, 2009 at 10:56 AM, Rafael Moral
rafa_moral2...@yahoo.com.brwrote:

 Dear useRs,

 I want to put a math expression in the ylab in my plot which should be a
 Delta and a 'y' with a trace and a hat above it and a 't' as subscription.
 How could I manage to do it?

 Thanks in advance,
 Regards,
 Rafael.



  
 
 [[elided Yahoo spam]]

[[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Lattice xyplot: same scales within one factor

2009-07-13 Thread OB

I am using R 2.8.1 and lattice to produce xyplots conditioned on
two factors. What I would like is to have the scales be free between values
of one factor, but some within. Thus, in this example,

xyplot(mpg ~ disp | factor(gear) + factor(cyl), mtcars,
    scales=list(x=list(relation=free)))

rather than having the x scales be free within a gear as well, I want it to
be the same for the gear, but free between cyl (and thus only have x scales
below the bottom panels with no scales or white space in the middle between
panels).

Any help would be greatly appreciated!

-Orion

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] dse model setup help

2009-07-13 Thread Bob McCall

I tried to specify a model in dse1 but something isn't right. Anybody
have any tips?

 model-SS(F=f,G=g,H=h,Q=q,z0=z,P0=p)
Error in locateSS(model$R, constants$R, R, p, p, plist) :
  The dimension of something in the SS model structure is bad.

 dim(f)
[1] 5 5
 dim(g)
[1] 5 1
 dim(h)
[1] 1 5
 dim(q)
[1] 5 5
 dim(z)
[1] 5 1
 dim(p)
[1] 5 5

thanks,
Bob

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] survSplit with data.frame containing a Surv object

2009-07-13 Thread Heinz Tuechler


Dear All,

since years I am struggling with Surv objects in data.frames. The 
following seems to have to do with it.
See below the modified example from the help page of survSplit. The 
original works, as expected. If, however, a Surv object is added to 
the data.frame, each record gets doubled.

Is there some solution other than avoiding Surv objects in data.frames?

Thanks,
Heinz


require(survival)

## from the help page
aml3-survSplit(aml,cut=c(5,10,50),end=time,start=start,
  event=status,episode=i)

summary(aml)
summary(aml3)

coxph(Surv(time,status)~x,data=aml)
## the same
coxph(Surv(start,time,status)~x,data=aml3)

## added to show doubling of records
aml.so - aml
aml.so$surv.object - with(aml, Surv(time, status))

aml3.so - survSplit(aml.so ,cut=c(5,10,50),end=time,start=start,
 event=status,episode=i)
summary(aml3.so)

sessionInfo('survival')
R version 2.9.1 Patched (2009-07-07 r48910)
i386-pc-mingw32

locale:
LC_COLLATE=German_Switzerland.1252;LC_CTYPE=German_Switzerland.1252;LC_MONETARY=German_Switzerland.1252;LC_NUMERIC=C;LC_TIME=German_Switzerland.1252

attached base packages:
character(0)

other attached packages:
[1] survival_2.35-4

loaded via a namespace (and not attached):
[1] base_2.9.1  graphics_2.9.1  grDevices_2.9.1 methods_2.9.1
[5] splines_2.9.1   stats_2.9.1 utils_2.9.1

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] variance explained by each predictor in GAM

2009-07-13 Thread Kayce Anderson

Simon,That produced exactly what I was looking for.  Thanks so much for the
humble help.

KC

On Mon, Jul 13, 2009 at 9:10 AM, Simon Wood s.w...@bath.ac.uk wrote:

 You can get some idea by doing something like the following, which compares
 the r^2 for models b and b2, i.e. with and without s(x2).  It keeps the
 smoothing parameters fixed for the comparison. (s(x,fx=TRUE) removes
 penalization altogether btw, which is not what was wanted).

 dat - gamSim(1,n=400,dist=normal,scale=2)
 b-gam(y~s(x0)+s(x1)+s(x2)+s(x3),data=dat)
 b2-gam(y~s(x0)+s(x1)+s(x3),sp=b$sp[-3],data=dat)
 summary(b2)$dev.expl
 summary(b)$dev.expl


 On Monday 13 July 2009 15:09, Kayce Anderson wrote:
  Many thanks for the advice David. I would really like to figure out,
  though, how to get the contribution of each factor to the Rsq - something
  like a Beta coefficient for GAM.   Ideas?
  KC
 
  On Sun, Jul 12, 2009 at 5:41 PM, David Winsemius
 dwinsem...@comcast.netwrote:
   On Jul 12, 2009, at 5:06 PM, Kayce Anderson wrote:
  
Hi,
  
   I am using mgcv:gam and have developed a model with 5 smoothed
   predictors and one factor.
  
   gam1 - gam(log.sp~ s(Spr.precip,bs=ts)  + s(Win.precip,bs=ts) +
 s(
   Spr.Tmin,bs=ts)  + s(P.sum.Tmin,bs=ts) + s( Win.Tmax,bs=ts)
   +factor(site),data=dat3)
  
  
   The total deviance explained = 70.4%.
  
  
   I would like to extract the variance explained by each predictor.  Is
   there
   a straightforward way to do this?  I have tried dropping a term and
   recalculating the model, but the edf's change if there is any
   correlation among variables, thereby making all of the relationships
   different.  I haven't yet figured out how to fix the smoothing terms-
 I
   get syntax error messages.  Among other variations, I tried, for
   example,
   log.sp~s(Spr.precip, sp=3.9, fx=TRUE) +...
  
   ?anova.gam
  
   Obviously I cannot test this with your dat3. You get an F-statistic for
   each s() term by default and you are referred to saummary.gam for
 further
   explanation.
  
   David Winsemius, MD
   Heritage Laboratories
   West Hartford, CT
 
[[alternative HTML version deleted]]
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
  http://www.R-project.org/posting-guide.html and provide commented,
 minimal,
  self-contained, reproducible code.

 --
  Simon Wood, Mathematical Sciences, University of Bath, Bath, BA2 7AY UK
  +44 1225 386603  www.maths.bath.ac.uk/~sw283


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] How to perform a Likelihood-ratio test?

2009-07-13 Thread Jeff Xu


Hi Lars,

Using the definition of Likelihood-ratio test is straightforward, 

Jeff

Lars Bergemann wrote:
 
 
 Hi.
 
  
 
 I would like to use a likelihood-ratio test to compare a linear model
 (lm()) to another linear model containing the first one to see if the
 extra factors are needed - but I was unable to find any help on how to do
 that.
 
 Could you help me and tell me what to do? Or tell me where to find help on
 this topic?
 
  
 
 Many thanks in advance!
 
  
 
 Lars
 
 _
 
 s. It's easy!
 
 aspxmkt=en-us
   [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 

-- 
View this message in context: 
http://www.nabble.com/How-to-perform-a-Likelihood-ratio-test--tp24460221p24463269.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] graph: axis label font

2009-07-13 Thread serbring


Hi, 

excuse me for my english, i am using R on windows and i have to do several
graphs with axis labels and the axis text thicks has a specified font type,
(Arial) and a specified font size. How can i do these? Thank you in advance

-- 
View this message in context: 
http://www.nabble.com/graph%3A-axis-label-font-tp24463974p24463974.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] getting a timeseries element into a string

2009-07-13 Thread tradenet


Thanks Ron.

I apologize for not properly specifying which class/package I was using.  I
am using the timeSeries class from the timeSeries Package, version 2100.83
from rmetrics.org

The code you supplied works for me! -- thanks again.

Warm regards,

Andrew


--

Ron Burns-2 wrote:
 
 Here is what I do:
 
 R library(fBasics)
 R ts-dummyDailySeries(x = rnorm(365), units = NULL, zone = , 
 FinCenter =)
 R class(ts)
 [1] timeSeries
 attr(,package)
 [1] timeSeries
 R (s - as.character(time(ts)[1]))
 [1] 1970-01-01
 R class(s)
 [1] character
 
 
 
 
 
 
 tradenet wrote:
 I added a reproducible example to my question...

 ts-dummyDailySeries(x = rnorm(365), units = NULL, zone = , FinCenter =
 )

   
 (ts[1,0])  #returns first date in return series
 
 GMT

 1970-01-01

   
 ttt-(sprintf(%s,ts[1,0]))
 

   
 print(ttt)
 
 character(0)

   
 ttt-(ts[1,0])
 

   
 print(ttt)
 
 GMT

 1970-01-01

 what I want to get is a string containing 1970-01-01

 Thank you.







 tradenet wrote:
   
 I have a timeseries object, ts, and want to get the first date in the
 series into a string so I can concatenate it with a SQL query. Input and
 output are shown below.  I must be missing something very basic, but I
 can't seem to pry the data (2008-07-01) into a string variable.  Any
 suggestions would be appreciated.

 Thank you,

 Andrew

 =script:
 class(ts)
 (ts[1,0])  #returns first date in return series
 ttt-(sprintf(%s,ts[1,0]))
 print(ttt)

 =output:


 
 class(ts)
   
 [1] timeSeries
 attr(,package)
 [1] timeSeries

 
 (ts[1,0])  #returns first date in return series
   
 GMT

 2008-07-01

 
 ttt-(sprintf(%s,ts[1,0]))
   
 print(ttt)
   
 character(0)




 

   
 
 
 -- 
 R. R. Burns
 Physicist (Retired)
 Oceanside, CA
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 

-- 
View this message in context: 
http://www.nabble.com/getting-a-timeseries-element-into-a-string-tp24429876p24465343.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] How to perform a Likelihood-ratio test?

2009-07-13 Thread Emmanuel Charpentier

Le lundi 13 juillet 2009 à 13:04 +0200, Lars Bergemann a écrit :
 Hi.
 
  
 
 I would like to use a likelihood-ratio test to compare a linear model (lm()) 
 to another linear model containing the first one to see if the extra factors 
 are needed - but I was unable to find any help on how to do that.
 
 Could you help me and tell me what to do? Or tell me where to find help on 
 this topic?

?anova? Check the test argument ...:-)

Emmanuel Charpentier

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] How to perform a Likelihood-ratio test?


On Mon, 13 Jul 2009, Jeff Xu wrote:



Hi Lars,

Using the definition of Likelihood-ratio test is straightforward,



To which I would add, following the instructions in the _posting guide_ is 
equally straightforward. viz



help.search(loglikelihood)

takes you to

?stats::logLik

Chuck



Jeff

Lars Bergemann wrote:



Hi.



I would like to use a likelihood-ratio test to compare a linear model
(lm()) to another linear model containing the first one to see if the
extra factors are needed - but I was unable to find any help on how to do
that.

Could you help me and tell me what to do? Or tell me where to find help on
this topic?



Many thanks in advance!



Lars

_

s. It's easy!

aspxmkt=en-us
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




--
View this message in context: 
http://www.nabble.com/How-to-perform-a-Likelihood-ratio-test--tp24460221p24463269.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Charles C. Berry(858) 534-2098
Dept of Family/Preventive Medicine
E mailto:cbe...@tajo.ucsd.edu   UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] How to perform a Likelihood-ratio test?

2009-07-13 Thread Jorge Ivan Velez

Dear Lars,
One way would be

anova(model1, model2)

See ?anova for more information.

HTH,

Jorge



On Mon, Jul 13, 2009 at 7:04 AM, Lars Bergemann
lars.bergem...@hotmail.comwrote:


 Hi.



 I would like to use a likelihood-ratio test to compare a linear model
 (lm()) to another linear model containing the first one to see if the extra
 factors are needed - but I was unable to find any help on how to do that.

 Could you help me and tell me what to do? Or tell me where to find help on
 this topic?



 Many thanks in advance!



 Lars

 _

 s. It's easy!

 aspxmkt=en-us
[[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Help me get this function to work...

2009-07-13 Thread Chip Maney


I have a function (see below).  This function has one object, ID.  If I run the 
loops by itself using a character value (ie,VFFF1-7), then the loops work 
fine.  However, when I try to insert the character value via the function call, 
it doesn't work. I don't get an error, but the TotalCover.df dataframe does not 
update according to the loop criteria. Any obvious problems that you can see?

Cover Function#

#Define Variables
Quadrats.df-unique(data.df$Quadrat)
TotalCover.df-cbind(0:750/10,0,0,0,0,0,0)
colnames(TotalCover.df)- 
c(Station,Shrub,Tree,Invasive,Herb,Litter,Bare)
Shrub.df-data.df[data.df$Layer==Shrub,]
Tree.df-data.df[data.df$Layer==Tree,]

Cover.fn-function(ID){

ShrubCover.df-Shrub.df[Shrub.df$Quadrat==ID,]
for (j in 1:length(ShrubCover.df[,Quadrat])){
for (i in 1:751){
if(TotalCover.df[i,Station]=ShrubCover.df[j,Start]  
TotalCover.df[i,Station]= ShrubCover.df[j,Stop]) 
TotalCover.df[i,Shrub]- 1
} 
}
TreeCover.df-Tree.df[Tree.df$Quadrat==ID,]
for (j in 1:length(TreeCover.df[,Quadrat])){
for (i in 1:751){
if(TotalCover.df[i,Station]=TreeCover.df[j,Start]  
TotalCover.df[i,Station]= TreeCover.df[j,Stop]) 
TotalCover.df[i,Tree]- 1 
}
}
}


Cover.fn(VFFF1-7)

_

 right from Hotmail. Check it out.

M_WL_QA_HM_celebrity_photos2_072009cat=celebrity
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] read.delim skips first column (why?)

2009-07-13 Thread Giovanni Marco Dall'Olio

Hi people,
I have a text file like this one posted:

snp_id  genechromosome  distance_from_gene_center
positionpop1pop2pop3pop4pop5pop6pop7
rs2129081   RAPT2   3   -129993 upstream  0.439009
1.169210NA  0.2330200.093042NA
-0.902596
rs1202698   RAPT2   3   -128695 upstream  NA
1.815000NA  0.3990791.8142701.382950
NA
rs1163207   RAPT2   3   -128224 upstream  NA  NA
NA  NA  NA  NA  NA
rs1834127   RAPT2   3   -128106 upstream  NA  NA
NA  NA  NA  NA  2.180670
rs2114211   RAPT2   3   -126738 upstream  -0.468279
-1.447620   NA  0.010616-0.414581   NA
0.550447
rs2113151   RAPT2   3   -124620 upstream  -0.897660
-1.971020   NA  -0.920327   -0.764658   NA
0.337127
rs2524130   RAPT2   3   -123029 upstream  -0.109795
-0.004646   -0.412059   1.1167400.667567
-0.924529   0.962841
rs1381318   RAPT2   3   -12818  upstream  -0.911662
-1.791580   NA  -0.945716   -1.239640   NA
0.004876
rs2113319   RAPT2   3   -122028 upstream  -0.911662
-1.738610   NA  -0.945716   -1.240950   NA  -0.005318

When I use read.delim (or any read function) on it, R skips the first
column, and I don' understand why.

For example:
$: R
 data = read.delim('snp_file.txt', head=T, sep='\t')

Now, I would expect data$snp_id to contain snp ids, and data$gene to contain
gene names; but it is not like this:

 data$snp_id
[1] RAPT2 RAPT2 RAPT2 RAPT2 RAPT2 RAPT2 RAPT2 RAPT2 RAPT2
Levels: RAPT2
 data$gene
[1] 3 3 3 3 3 3 3 3 3

 summary(data)
  snp_id   gene chromosome  distance_from_gene_center
 RAPT2:9   Min.   :3   Min.   :-129993   upstream:9
   1st Qu.:3   1st Qu.:-128224
   Median :3   Median :-126738
   Mean   :3   Mean   :-113806
   3rd Qu.:3   3rd Qu.:-123029
   Max.   :3   Max.   : -12818


 data$pop7
[1] NA NA NA NA NA NA NA NA NA


Notice that it did use snp_id as the header for the first column, but it
skips completely al the data from that column, and all the fields are
shifted, so the last column is filled with NA values.

What I am doing wrong? Can it be a problem of my data files? I have tried to
modify them a bit (add new columns, etc..) but it didn't work.

I am running R from an Ubuntu system:
 sessionInfo()
R version 2.9.1 (2009-06-26)
i486-pc-linux-gnu

locale:
LC_CTYPE=it_IT.UTF-8;LC_NUMERIC=C;LC_TIME=it_IT.UTF-8;LC_COLLATE=it_IT.UTF-8;LC_MONETARY=C;LC_MESSAGES=it_IT.UTF-8;LC_PAPER=it_IT.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=it_IT.UTF-8;LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base




-- 
Giovanni Dall'Olio, phd student
Department of Biologia Evolutiva at CEXS-UPF (Barcelona, Spain)

My blog on bioinformatics: http://bioinfoblog.it

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] graph: axis label font

2009-07-13 Thread Michael Knudsen

On Mon, Jul 13, 2009 at 5:31 PM, serbringbracard...@email.it wrote:

 excuse me for my english, i am using R on windows and i have to do several
 graphs with axis labels and the axis text thicks has a specified font type,
 (Arial) and a specified font size. How can i do these? Thank you in advance

Interesting question, I didn't know the answer to, so I tried to look
it up. There might be some help towards the bottom of this page:

http://www.statmethods.net/advgraphs/parameters.html

It seems to be specific for Windows, so I can't test it myself.

-- 
Michael Knudsen
micknud...@gmail.com
http://lifeofknudsen.blogspot.com/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Help get this simple function to work...

2009-07-13 Thread chipmaney


I have a function (see below).  This function has one object, ID.  If I run
the loops by themselves using a character value (ie,VFFF1-7) instead of
the function object, then the loops work fine.  However, when I try to
insert the character value via the function call, it doesn't work. I don't
get an error, but the TotalCover.df dataframe does not update according to
the loop criteria. Any obvious problems that you can see?
 
Cover Function#
 
#Define Variables
Quadrats.df-unique(data.df$Quadrat)
TotalCover.df-cbind(0:750/10,0,0,0,0,0,0)
colnames(TotalCover.df)-
c(Station,Shrub,Tree,Invasive,Herb,Litter,Bare)
Shrub.df-data.df[data.df$Layer==Shrub,]
Tree.df-data.df[data.df$Layer==Tree,]
 
Cover.fn-function(ID){
 
ShrubCover.df-Shrub.df[Shrub.df$Quadrat==ID,]
for (j in 1:length(ShrubCover.df[,Quadrat])){
for (i in 1:751){
if(TotalCover.df[i,Station]=ShrubCover.df[j,Start]
 TotalCover.df[i,Station]= ShrubCover.df[j,Stop]) 
TotalCover.df[i,Shrub]- 1
} 
}
TreeCover.df-Tree.df[Tree.df$Quadrat==ID,]
for (j in 1:length(TreeCover.df[,Quadrat])){
for (i in 1:751){
if(TotalCover.df[i,Station]=TreeCover.df[j,Start]
 TotalCover.df[i,Station]= TreeCover.df[j,Stop]) 
TotalCover.df[i,Tree]- 1 
}
}
}
 
 
Cover.fn(VFFF1-7)
-- 
View this message in context: 
http://www.nabble.com/Help-get-this-simple-function-to-work...-tp24466533p24466533.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] survSplit with data.frame containing a Surv object


On Mon, 13 Jul 2009, Heinz Tuechler wrote:


Dear All,

since years I am struggling with Surv objects in data.frames. The following 
seems to have to do with it.
See below the modified example from the help page of survSplit. The original 
works, as expected. If, however, a Surv object is added to the data.frame, 
each record gets doubled.

Is there some solution other than avoiding Surv objects in data.frames?


I think you can modify survSplit so that it will properly handle Surv 
objects.


Change this line:

newdata - lapply(data, rep, ntimes + 1)

to this:

newdata - lapply(data,
function(x) {
x - as.matrix(x)
x[rep(1:nrow(x), ntimes + 1),]
})

or something similar that results Surv objects being rep()'ed rowwise 
rather than elementwise and returned as objects of the right 
dimension (rather than as a vector).


Caveat: This works in the example you give, but I've not tested this 
extensively.


HTH,

Chuck





Thanks,
Heinz


require(survival)

## from the help page
aml3-survSplit(aml,cut=c(5,10,50),end=time,start=start,
  event=status,episode=i)

summary(aml)
summary(aml3)

coxph(Surv(time,status)~x,data=aml)
## the same
coxph(Surv(start,time,status)~x,data=aml3)

## added to show doubling of records
aml.so - aml
aml.so$surv.object - with(aml, Surv(time, status))

aml3.so - survSplit(aml.so ,cut=c(5,10,50),end=time,start=start,
event=status,episode=i)
summary(aml3.so)

sessionInfo('survival')
R version 2.9.1 Patched (2009-07-07 r48910)
i386-pc-mingw32

locale:
LC_COLLATE=German_Switzerland.1252;LC_CTYPE=German_Switzerland.1252;LC_MONETARY=German_Switzerland.1252;LC_NUMERIC=C;LC_TIME=German_Switzerland.1252

attached base packages:
character(0)

other attached packages:
[1] survival_2.35-4

loaded via a namespace (and not attached):
[1] base_2.9.1  graphics_2.9.1  grDevices_2.9.1 methods_2.9.1
[5] splines_2.9.1   stats_2.9.1 utils_2.9.1

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




Charles C. Berry(858) 534-2098
Dept of Family/Preventive Medicine
E mailto:cbe...@tajo.ucsd.edu   UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Help me get this function to work...

2009-07-13 Thread David Winsemius

In R a function only returns the last evaluation, so you need to wrap  
up all of the local results into a list at the end of the function.



On Jul 13, 2009, at 1:23 PM, Chip Maney wrote:



I have a function (see below).  This function has one object, ID.   
If I run the loops by itself using a character value (ie,VFFF1-7),  
then the loops work fine.  However, when I try to insert the  
character value via the function call, it doesn't work. I don't get  
an error, but the TotalCover.df dataframe does not update according  
to the loop criteria. Any obvious problems that you can see?


Cover Function#

#Define Variables
Quadrats.df-unique(data.df$Quadrat)
TotalCover.df-cbind(0:750/10,0,0,0,0,0,0)
   colnames(TotalCover.df)-  
c(Station,Shrub,Tree,Invasive,Herb,Litter,Bare)

Shrub.df-data.df[data.df$Layer==Shrub,]
Tree.df-data.df[data.df$Layer==Tree,]

Cover.fn-function(ID){

   ShrubCover.df-Shrub.df[Shrub.df$Quadrat==ID,]
   for (j in 1:length(ShrubCover.df[,Quadrat])){
   for (i in 1:751){
   if 
(TotalCover.df[i,Station]=ShrubCover.df[j,Start]   
TotalCover.df[i,Station]= ShrubCover.df[j,Stop])

   TotalCover.df[i,Shrub]- 1
   }
   }
   TreeCover.df-Tree.df[Tree.df$Quadrat==ID,]
   for (j in 1:length(TreeCover.df[,Quadrat])){
   for (i in 1:751){
   if 
(TotalCover.df[i,Station]=TreeCover.df[j,Start]   
TotalCover.df[i,Station]= TreeCover.df[j,Stop])

   TotalCover.df[i,Tree]- 1
   }
   }

# Perhaps something like:

list(ShrubCover.df, TreeCover.df, TotalCover.df)


}


Cover.fn(VFFF1-7)

_

right from Hotmail. Check it out.

M_WL_QA_HM_celebrity_photos2_072009cat=celebrity
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] how to run a R program with input arguments

2009-07-13 Thread edisonying


I am a beginner in R and know only a little about it yet. I have a script
written in R language, named as a.txt for example. I am using a Linux
machine, at present I only know that I can type R in the terminal and then
copy-paste the content in a.txt to the R's interface to execute the
program. However, I want to know if there is any method that allows me to
add some input arguments directly after the R in the terminal.
Specifically, I want to type the following in the cmd line:

   R (some flags or option) a.txt

then the program will begin to run. Besides, if the program will read a data
file first, can I also specify the data file in the command line? Then the
complete command will become:

   R  a.txt  data.txt

This is important for a beginner. Thanks very much!:working:
   
-- 
View this message in context: 
http://www.nabble.com/how-to-run-a-R-program-with-input-arguments-tp24465852p24465852.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] object .trPaths not found

2009-07-13 Thread Thomas Petzoldt


Hi,

using a variable named .trPaths is a feature and a frequent nuisance of 
recent Tinn-R and also explained in the Tinn-R docs.


Solution 1:

In Tinn-R select the menu:

 R -- Configure -- and then Temporarily or Permanent

Solution 2:

Manually edit file R-installation-directory/etc/Rconfig.site

The minimal line to add is:

.trPaths - paste(paste(Sys.getenv('APPDATA'), '\\Tinn-R\\tmp\\', 
sep=''), c('', 'search.txt', 'objects.txt', 'file.r', 'selection.r', 
'block.r', 'lines.r'), sep='')



Hope it helps

Thomas P.


Knut Krueger schrieb:

Uwe Ligges schrieb:




I have to admit that I have no idea what we are talking about here 
(yes, I tend to forget many things these days) - and you have not 
cited the original message, unfortunately (nor have you specifies R 
versions, Tinn-R versions and both OS versions, but just one) ...


Best wishes,
Uwe

Original question
rkevinbur...@charter.net wrote:
I am running an R script with Tinn-R (2.2.0.1) and I get the error 
message

 
Error in source(.trPaths[4], echo = TRUE, max.deparse.length = 150) : 
  object .trPaths not found


Any solutions?

System

R 2.8.1
Tinn-R 2.2.02
windows Xp professional Servicepack 3

This combination works on my Pc without problems
And the code is working on the other system with Tinn-R problems
with the R-Editor without problems
so it seems to be a Tinn_R problem

but maybe anybody has a solution for that.

Regards Knut

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Help me get this function to work...

On Mon, Jul 13, 2009 at 12:08 PM, David Winsemiusdwinsem...@comcast.net wrote:
 In R a function only returns the last evaluation, so you need to wrap up all
 of the local results into a list at the end of the function.



SNIP

 David Winsemius, MD
 Heritage Laboratories
 West Hartford, CT


How important it is to wrap the list in a return statement, ala

return(list(ShrubCover.df, TreeCover.df, TotalCover.df))

or

answer - list(ShrubCover.df, TreeCover.df, TotalCover.df)
return(answer)

Thanks,
Mark

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] how to run a R program with input arguments

2009-07-13 Thread Henrique Dallazuanna

See commandArgs() function.

On Mon, Jul 13, 2009 at 4:15 PM, edisonying edisonying1...@gmail.comwrote:

I am a beginner in R and know only a little about it yet. I have a script
written in R language, named as a.txt for example. I am using a Linux
machine, at present I only know that I can type R in the terminal and
then
copy-paste the content in a.txt to the R's interface to execute the
program. However, I want to know if there is any method that allows me to
add some input arguments directly after the R in the terminal.
Specifically, I want to type the following in the cmd line:

R (some flags or option) a.txt

then the program will begin to run. Besides, if the program will read a
data
file first, can I also specify the data file in the command line? Then the
complete command will become:

R a.txt data.txt

This is important for a beginner. Thanks very much!:working:

--
View this message in context:
http://www.nabble.com/how-to-run-a-R-program-with-input-arguments-tp24465852p24465852.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] \dQuote in packages

2009-07-13 Thread Rebecca Sela

Thank you!  (That was easy to fix.)

How does one deal with quoting (in \reference)?  The following line causes 
problems:
\references{Sela, Rebecca J., and Simonoff, Jeffrey S., \dQuote{RE-EM Trees: A 
New Data Mining Approach for Longitudinal Data}.}
The error given is:
Warning in parse_Rd(./man/predict.Rd, encoding = unknown) :
  ./man/predict.Rd:28: unknown macro '\dquote'
*** error on file ./man/predict.Rd
Error : ./man/predict.Rd:28: Unrecognized macro \dquote

The manual for writing R packages said I should not just use the character .  
What should I be using here?

Thanks again!

Rebecca



- Original Message -
From: Uwe Ligges lig...@statistik.tu-dortmund.de
To: Rebecca Sela rs...@stern.nyu.edu
Cc: r-help r-help@r-project.org
Sent: Friday, July 10, 2009 8:17:44 PM GMT -05:00 US/Canada Eastern
Subject: Re: [R] \dQuote in packages



Rebecca Sela wrote:
 Here is one Rd file with problems, now inline so that it can be read:
 
 \name{simpleREEMdata}
 \docType{data}
 \alias{simpleREEMdata}
 \title{Sample Data for RE-EM trees}
 \description{
 This data set is consists of a panel of 50 individuals with 12 observations 
 per individual.  The data is based on a regression tree with an initial split 
 based on a dummy variable (\code{D}) and a second split based on time in the 
 branch where \code{D=1}.  The observations include both randomly generated 
 individual-specific effects and observation-specific errors.
 }
 \format{
 The data has 600 rows and 5 columns.  The columns are: 

insert here:

\itemize{

 \item{\code{Y}}{the target variable}
 \item{\code{t}}{a numeric predictor (time)}
 \item{\code{D}}{a catergorical predictor with two levels, 0 and 1}
 \item{\code{ID}}{the identifier for each individual}
 \item{\code{X}}{another covariate (which is intentionally unrelated to the 
 target variable)}

insert here:

}


or in other words, you need an itemize environment in order to use \item 
within \format, see the manual Writing R Extensions.

Best,
Uwe


 }
 \references{Sela, Rebecca J., and Simonoff, Jeffrey S., \dQuote{RE-EM Trees: 
 A New Data Mining Approach for Longitudinal Data}.}
 \keyword{datasets}
 
 Thanks again for your help!
 
 Rebecca
 
 - Original Message -
 From: Uwe Ligges lig...@statistik.tu-dortmund.de
 To: Rebecca Sela rs...@stern.nyu.edu
 Cc: r-help r-help@r-project.org
 Sent: Thursday, July 9, 2009 6:05:46 AM GMT -05:00 US/Canada Eastern
 Subject: Re: [R] \dQuote in packages
 
 Rebecca,
 
 the attachments have been stripped off by the mailing list.
 
 
 Rebecca Sela wrote:
 That's good to know.  I have attached three Rd files that gave errors 
 (others gave identical errors).  I would love to know what is wrong with 
 them.

 I'm using 2.1.1 because that is what is installed on the Linux computer I 
 have access to.  (I haven't bothered figuring out how to assemble a package 
 in Windows.)
 
 You should *really* upgrade! That version is outdated for several years now.
 
 How to do it on Windows: See the R Installation and Administration 
 manual with its corresponding section.
 
 Best,
 Uwe
 
 
 Thank you for your help!

 Rebecca


 - Original Message -
 From: Uwe Ligges lig...@statistik.tu-dortmund.de
 To: Rebecca Sela rs...@stern.nyu.edu
 Cc: r-help r-help@r-project.org
 Sent: Wednesday, July 8, 2009 6:11:33 PM GMT -05:00 US/Canada Eastern
 Subject: Re: [R] \dQuote in packages

 The difference you are experiencing is the new Rd2 parser that is more 
 picky now (but also prevents to produce wrong documentation files).

 If you make the code of the Rd available, someone might be able to help.

 Are you really under R-2.1.1 ??? That is really ancient!


 Best,
 Uwe Ligges



 Rebecca Sela wrote:
 I am in the process of submitting a package to CRAN.  R CMD check ran 
 successfully on the package on my local computer, using R version 2.1.1.  
 However, on the computers for CRAN (with version 2.10.0), the following 
 errors occurred:

 Warning in parse_Rd(./man/predict.Rd, encoding = unknown) :
   ./man/predict.Rd:28: unknown macro '\dquote'
 *** error on file ./man/predict.Rd
 Error : ./man/predict.Rd:28: Unrecognized macro \dquote
 Warning in parse_Rd(./man/print.Rd, encoding = unknown) :
   ./man/print.Rd:17: unexpected UNKNOWN '\sideeffects'
 Warning in parse_Rd(./man/simpleREEMdata.Rd, encoding = unknown) :
   ./man/simpleREEMdata.Rd:10: unknown macro '\item'

 Are \dquote, \sideeffects, and \item not supported in newer versions of R?  
 Is there some underlying problem that I should fix that makes these show up?

 Thank you very much.

 Rebecca

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

 

 __

[R] .RData signature

2009-07-13 Thread stenka1


   Is there a way to identify the beginning and end of an .RData file?
   I am trying to restore lost files on opensolaris where the table of contents
   is overwritten and will be reading the raw disk byte by byte.
   I looked at several files in hex and the end seems different, while most of
   them, but not all begin with RDX2.X
   Thank you.
   Stephen C. Bond
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] error: optim(rho, n2ll.rho, method = method, control = control, beta = parm$beta, : initial value in 'vmmin' is not finite

2009-07-13 Thread Ravi Varadhan

Hi Brandy,

The `vmmin' refers to a variable metric algorithm, which is a quasi-Newton
method for optimization.  This algorithm is used when `method=BFGS' in the
optim() call.  The quasi-Newton methods iteratively build an approopriate
Hessian matrix, which is of dimension p x p, where p is the problem size.
In your case the hessian matrix is 4000 x 4000, which BFGS isi unable to
handle.  

If the code for the function lnam() and dependent functions is entirely in
R, you can try a different algorithm than BFGS.  For example, you can try
CG, which can handle high-dimensional optimization.  However, before doing
that I would make sure that CG works well and gives same answer as BFGS
on smaller problems.  If CG doesn't cut it, then I would try the spg()
function in the BB package.

If the code for lnam() is not in R, then I would contact the package author
to help you out with incorporating other optimizers for your
high-dimensional problem.

Hope this helps,
Ravi.



---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: rvarad...@jhmi.edu

Webpage:
http://www.jhsph.edu/agingandhealth/People/Faculty_personal_pages/Varadhan.h
tml

 





-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Brandy Lee Aven
Sent: Thursday, July 09, 2009 11:11 PM
To: r-help@r-project.org
Subject: [R] error: optim(rho, n2ll.rho, method = method, control = control,
beta = parm$beta, : initial value in 'vmmin' is not finite

I am trying to use the lnam autocorrelation model from the SNA package. I
have it running for smaller adjacency matrices (1,500) it works just fine
but when my matrices are bigger 4000+. I get the error: 

 lnam1_01.adj- lnam(data01$adopt,x01,ec2001.csr)
Error in optim(rho, n2ll.rho, method = method, control = control, beta =
parm$beta,  : 
  initial value in 'vmmin' is not finite


I have looked at the lnam code and cant even figure out what vmmin is. 
Is there anyway around this? Am I doing something wrong? What makes me think
that its about the size of the adjacency matrix is that I can run the same
command on similar objects that are just smaller and it works fine. 

please help!

sessionInfo()
R version 2.9.1 (2009-06-26)
x86_64-pc-linux-gnu 

locale:
C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base 

other attached packages:
[1] numDeriv_2009.2-1 sna_2.0-1

loaded via a namespace (and not attached):
[1] tools_2.9.1

 class(data01$adopt) #This is the response vector y
[1] integer

data01$adopt[1:10] # Its just a binary outcome for all vertices
[1] 0 0 0 0 0 0 0 0 0 0 ..until 4,003

 class(x01) #X01 is a matrix of my six covariates for all vertices

[1] matrix

#here is the an example of the data
x01[1:10,1:6]
on01 indegree outdegree between eigen numalters01
1   1   0   0   0   1   1
19  1   0   1   0   0   1
123 1   0   1   0   0   1
140 1   0   1   0   0   1
169 1   0   1   0   0   1
189 1   0   1   0   0   1
195 1   0   1   0   0   2
204 1   0   1   0   0   1
231 1   0   2   0   0   1
252 1   0   3   0   0   4

# this is the adjacency matrix (in Sparse matrix format) that causes the
error. I have another that is 10,500 and does the same thing.
dim(ec2001.csr)
[1] 4003 4003

class(ec2001.csr)
[1] matrix.csr
attr(,package)
[1] SparseM

ec2001.csr[1:10,1:10] #here is what it looks like
1 19 123 140 169 189 195 204 231 252
1   1  0   0   0   0   0   0   0   0   0
19  0  0   0   1   0   0   0   0   0   0
123 0  0   0   0   0   0   0   0   0   0
140 0  0   0   0   0   0   0   0   0   0
169 0  0   0   0   0   0   0   0   0   0
189 0  0   1   0   0   0   0   0   0   0
195 0  0   0   0   0   0   0   0   0   0
204 0  0   0   0   0   0   0   0   0   0
231 0  0   0   0   0   0   0   0   0   0
252 0  0   0   0   0   0   0   0   0   0


#There are also no infinite values in the objects. 
is.infinite(x01)
[1] FALSE . N

is.infinite(data01$adopt)
[1] FALSE .N

is.infinite(ec2001.csr)
[1] FALSE

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide

[R] regression with replication

2009-07-13 Thread Agus Susanto

Dear all,
I would like to fit a linear regression with replication (on each year,
observation is replicated, e.g 4 times). The independent variable ranges
for instance 1-5 year, so I expect to have a linear fit of 5 points.
For that purpose I do these (with dummy variables x and y):

x-rep(seq(1:5),4)
y-rnorm(20)
linreg-lm(y~x)
fitted.values(linreg)  # why produce 20 points of estimate?
predict(linreg)# why produce 20 points of estimate?

Please somebody explain:
1. why both fitted.values and predict functions produced 20 points of
estimate, NOT 5 points.
2. is lm(y~x) correct to solve this regression case, or there's a
correct procedure.

Many thanks.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Help me get this function to work...

2009-07-13 Thread Bert Gunter

How important it is to wrap the list in a return statement, ala

return(list(ShrubCover.df, TreeCover.df, TotalCover.df))

or

answer - list(ShrubCover.df, TreeCover.df, TotalCover.df)
return(answer)
---

Completely Un.

Consult the R Docs, especially the R Language Definition manual, for answers
to this and your other numerous basic questions. That's what they're there
for.

-- Bert Gunter
Genentech, Inc.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] testing equality of two dependent correlations + normality issue

2009-07-13 Thread Jaroslav Hlinka


Hi, I am turning to you with a (hopefully simple?) stats question. I would
like to test equality of two correlation coefficients in a setting with
three variables X,Y,Z, i.e. equality of r(X,Y) and r(Z,Y). I have found a
formula to transform the 2 dependent correlations difference to
t-distribution with N-3 df:
t = (rxy - rzy)* SQRT[{(n - 3)(1 + rxz)}/ {2(1 - rxy^2 - rxz^2 - rzy^2 +
2rxy*rxz*rzy)}]
(Blalock, H., 1972. Social Statistics. NY: McGraw-Hill. Page 406-7). Am
actually not sure whether this is exact or approximate (even given normality
assumption, the Fisher's Z-transform which this is - I assume - based on, is
approximate, right?).
But to make it a bit more complicated, Shapiro-Wilks test of normality gives
p=0.022 for variable X. Therefore assuming normality may not be safe
(justifiable) at all? What do I do then? Do I report this test as
assymptotically valid, or do I run some other test?

Any ideas? Many thanks in advance,
Jaroslav

-- 
View this message in context: 
http://www.nabble.com/testing-equality-of-two-dependent-correlations-%2B-normality-issue-tp24465768p24465768.html
Sent from the R help mailing list archive at Nabble.com.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] re ference

2009-07-13 Thread Derek Lacoursiere


Hello R users,

I have developed code in R that generates populations of non-overlapping
random polygons (polygonal maps) for testing certain stereological
estimation methods.  I have tried to look for articles or papers online on
the generation of random polygonal maps and have found very little.  There
seems to be a much greater concern with generating single random polygons
rather than whole maps of random polygons.  Does anyone have an idea of
where I should look or what keywords to use in my search?

I really appreciate your patience with R novices like myself.
-- 
View this message in context: 
http://www.nabble.com/reference-tp24467408p24467408.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Reducing arrays for comparison with each other...

Hi,
   First up, thanks again for all the help I'm getting on this list.
I'm making great headway in analyzing my experimental data on an
experiment by experiment basis. No way I could have done this in the
time I've done it without your help.

   This email is partially a question about R but is also soliciting
overall guidance if anyone wants to give it. I somehow managed to get
an electrical engineering degree decades ago from the University of
California system without ever being required to take a statistics
class so in that area I'm lacking background. Maybe there's a good
subject someone can point me toward. (Keep it simple though!)

   OK, I've now got dozens of experiments sitting in individual
data.frames. In most cases the data.frames look more or less like the
example below - Experiment Start time down the left side, Experiment
Completion time across the top and results in the cells. The following
data started at 830 and is 3 minute data so the first measurement came
at 833, then 836, etc.

EnTime  836  842   845  848   851  854  900   903
1  833 -3860  -772 -938  -7720 -386
2  8360 -386 00 0 -246  714  -632
3  83900 00  -38600  -772
4  84200  -3860 00 -682 0
5  84500 00 000 0
6  84800 00  -38600 0
7  85100 00 000 0
8  85400 00 00 -386 0
9  85700 00 000 0
10 90000 00 000 0
11 90300 00 00

   The issue I'm thinking about now is how to compare experiments when
the time increment for each is different? This one is 3 minutes,
others are 1 minute, 5 minutes, 13 minutes, etc. My initial thought is
to try and roll things up into common sizes, like 30 minute or 60
minute chunks. Granted, I won't get the same number of measurements in
each chunk, but it's a start and would yield a data.frame that's easy
to view in the Rgui console so I'd be happy with that, but is that a
good method in terms of the statistics of the problem? should I be
paying attention to the number of experiments, maybe as a percentage
of the whole, or something else? I don't know.

   Anyway, I'd like to try rolling this data up by time increment to
start and I'm wondering if there any easy ways to do that? Taking 15
minute increments I would like to get something like

   EnTime845  900 915
1  845  SUM SUM SUM
2  900  SUM SUM SUM
4  915  SUM SUM SUM

where the SUM is the sum of the pieces making up this new cell.

I tried using subset but apparently it doesn't work on arrays like
this, telling me that EnTime=830 isn't valid for factors. I can
certainly find ways to munge the data set, changing EnTime values but
that seems like stuff I shouldn'y be doing. that said it's pretty easy
to do I suspect. (Modulo 15, etc.)

   Is there a way to use cast() to grab time increments. Instead of
cast(Results, EnTime ~ ExTime, ...) returning by the exact value, can
cast collect values together in groups?

   Any ideas appreciated before I start banging away at doing my less
interesting methods.

Thanks,
Mark

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Help me get this function to work...


On 7/13/2009 3:21 PM, Mark Knecht wrote:

On Mon, Jul 13, 2009 at 12:08 PM, David Winsemiusdwinsem...@comcast.net wrote:

In R a function only returns the last evaluation, so you need to wrap up all
of the local results into a list at the end of the function.




SNIP


David Winsemius, MD
Heritage Laboratories
West Hartford, CT



How important it is to wrap the list in a return statement, ala

return(list(ShrubCover.df, TreeCover.df, TotalCover.df))

or

answer - list(ShrubCover.df, TreeCover.df, TotalCover.df)
return(answer)


Those are almost identical.

The only difference is that the second version will leave the local 
variable answer in the evaluation frame.  For most functions the 
evaluation frame disappears after the return, but sometimes it lives a 
while longer.


For example, you can return it explicitly by putting environment() as 
one of the return values.  More commonly it is returned implicitly as 
the environment of a function created during evaluation, e.g.


 buildf - function() {
+   f - function(x) x + 1
+   return(f)
+ }
 g - buildf()
 g(5)
[1] 6
 ls(environment(g))
[1] f

The last command looks in the environment of g, and sees the local 
variable f there, of which g is a copy.  If you had done


answer - f
return(answer)

you'd also see answer there.

Duncan Murdoch

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] SweaveListingUtils question

2009-07-13 Thread Karsten Weinert

Hello,
recently I read about the SweaveListingUtils package and now I want to try
it out. However, I can not make it work...

Below a minimal example. The problem seems to be the following line
(generated by SweaveListingPreparations()?):

\ifthenelse{\boolean{swe...@gin}}{\setkeys{gin}{width=0.6\textwidth}}{}%

If I comment out this line, it works.

What can I do about this?

I am using the \begin{Scode} notation instead of Rnw files. May this be the
problem? For some technical reason, I would like to stick on this notation.

Any help appreciated,

kind regards,
Karsten.

 begin example

\documentclass[9pt]{beamer}

\usepackage{fancyvrb}
\usepackage{listings}

% choose language and char set
\usepackage[ngerman]{babel}
\usepackage[ansinew]{inputenc}
%

\SweaveOpts{prefix.string=Routput/parcel, keep.source=TRUE}

\begin{Scode}{results=tex, echo=FALSE}
require(SweaveListingUtils)
SweaveListingoptions(intermediate = FALSE)
SweaveListingPreparations()
\end{Scode}


\begin{document}


\begin{frame}[fragile]
\frametitle{Your title}

\begin{lstlisting}options(digits=3)
set.seed(1)

require(Hmisc)

age - rnorm(1000,50,10)
sex - sample(c('female','male'),1000,TRUE)
out - histbackback(split(age, sex), probability=TRUE, xlim=c(-.06,.06),
main = 'Back to Back Histogram')

#! just adding color
barplot(-out$left, col=red , horiz=TRUE, space=0, add=TRUE, axes=FALSE)
barplot(out$right, col=blue, horiz=TRUE, space=0, add=TRUE, axes=FALSE)
\end{lstlisting}
\end{frame}

\begin{Scode}{echo=FALSE}
  unloadNamespace(SweaveListingUtils)
\end{Scode}

\end{document}

% end example

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] .RData signature


On 7/13/2009 3:38 PM, sten...@go.com wrote:

   Is there a way to identify the beginning and end of an .RData file?
   I am trying to restore lost files on opensolaris where the table of contents
   is overwritten and will be reading the raw disk byte by byte.
   I looked at several files in hex and the end seems different, while most of
   them, but not all begin with RDX2.X


No, there is nothing special written at the end of a file.  Many .RData 
files are compressed (by internal gzip), so the RDX2 header is hidden, 
and you'll see bytes 1F 8B instead.  (I don't think we write the trailer 
bytes on the gzip stream, but I'm not sure about that.)


Duncan Murdoch

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] \dQuote in packages


On 7/13/2009 3:27 PM, Rebecca Sela wrote:

Thank you!  (That was easy to fix.)

How does one deal with quoting (in \reference)?  The following line causes 
problems:
\references{Sela, Rebecca J., and Simonoff, Jeffrey S., \dQuote{RE-EM Trees: A 
New Data Mining Approach for Longitudinal Data}.}
The error given is:
Warning in parse_Rd(./man/predict.Rd, encoding = unknown) :
  ./man/predict.Rd:28: unknown macro '\dquote'
*** error on file ./man/predict.Rd
Error : ./man/predict.Rd:28: Unrecognized macro \dquote

The manual for writing R packages said I should not just use the character .  
What should I be using here?


Are you sure you showed us the right line?  The error message says 
\dquote, but the line contains \dQuote.  \dquote is wrong, \dQuote 
is fine.


Duncan Murdoch



Thanks again!

Rebecca



- Original Message -
From: Uwe Ligges lig...@statistik.tu-dortmund.de
To: Rebecca Sela rs...@stern.nyu.edu
Cc: r-help r-help@r-project.org
Sent: Friday, July 10, 2009 8:17:44 PM GMT -05:00 US/Canada Eastern
Subject: Re: [R] \dQuote in packages



Rebecca Sela wrote:

Here is one Rd file with problems, now inline so that it can be read:

\name{simpleREEMdata}
\docType{data}
\alias{simpleREEMdata}
\title{Sample Data for RE-EM trees}
\description{
This data set is consists of a panel of 50 individuals with 12 observations per 
individual.  The data is based on a regression tree with an initial split based 
on a dummy variable (\code{D}) and a second split based on time in the branch 
where \code{D=1}.  The observations include both randomly generated 
individual-specific effects and observation-specific errors.
}
\format{
The data has 600 rows and 5 columns.  The columns are: 


insert here:

\itemize{


\item{\code{Y}}{the target variable}
\item{\code{t}}{a numeric predictor (time)}
\item{\code{D}}{a catergorical predictor with two levels, 0 and 1}
\item{\code{ID}}{the identifier for each individual}
\item{\code{X}}{another covariate (which is intentionally unrelated to the 
target variable)}


insert here:

}


or in other words, you need an itemize environment in order to use \item 
within \format, see the manual Writing R Extensions.


Best,
Uwe



}
\references{Sela, Rebecca J., and Simonoff, Jeffrey S., \dQuote{RE-EM Trees: A 
New Data Mining Approach for Longitudinal Data}.}
\keyword{datasets}

Thanks again for your help!

Rebecca

- Original Message -
From: Uwe Ligges lig...@statistik.tu-dortmund.de
To: Rebecca Sela rs...@stern.nyu.edu
Cc: r-help r-help@r-project.org
Sent: Thursday, July 9, 2009 6:05:46 AM GMT -05:00 US/Canada Eastern
Subject: Re: [R] \dQuote in packages

Rebecca,

the attachments have been stripped off by the mailing list.


Rebecca Sela wrote:

That's good to know.  I have attached three Rd files that gave errors (others 
gave identical errors).  I would love to know what is wrong with them.

I'm using 2.1.1 because that is what is installed on the Linux computer I have 
access to.  (I haven't bothered figuring out how to assemble a package in 
Windows.)


You should *really* upgrade! That version is outdated for several years now.

How to do it on Windows: See the R Installation and Administration 
manual with its corresponding section.


Best,
Uwe



Thank you for your help!

Rebecca


- Original Message -
From: Uwe Ligges lig...@statistik.tu-dortmund.de
To: Rebecca Sela rs...@stern.nyu.edu
Cc: r-help r-help@r-project.org
Sent: Wednesday, July 8, 2009 6:11:33 PM GMT -05:00 US/Canada Eastern
Subject: Re: [R] \dQuote in packages

The difference you are experiencing is the new Rd2 parser that is more 
picky now (but also prevents to produce wrong documentation files).


If you make the code of the Rd available, someone might be able to help.

Are you really under R-2.1.1 ??? That is really ancient!


Best,
Uwe Ligges



Rebecca Sela wrote:

I am in the process of submitting a package to CRAN.  R CMD check ran 
successfully on the package on my local computer, using R version 2.1.1.  
However, on the computers for CRAN (with version 2.10.0), the following errors 
occurred:

Warning in parse_Rd(./man/predict.Rd, encoding = unknown) :
  ./man/predict.Rd:28: unknown macro '\dquote'
*** error on file ./man/predict.Rd
Error : ./man/predict.Rd:28: Unrecognized macro \dquote
Warning in parse_Rd(./man/print.Rd, encoding = unknown) :
  ./man/print.Rd:17: unexpected UNKNOWN '\sideeffects'
Warning in parse_Rd(./man/simpleREEMdata.Rd, encoding = unknown) :
  ./man/simpleREEMdata.Rd:10: unknown macro '\item'

Are \dquote, \sideeffects, and \item not supported in newer versions of R?  Is 
there some underlying problem that I should fix that makes these show up?

Thank you very much.

Rebecca

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] object .trPaths not found

2009-07-13 Thread Knut Krueger


Thomas Petzoldt schrieb:


Hi,

using a variable named .trPaths is a feature and a frequent nuisance 
of recent Tinn-R and also explained in the Tinn-R docs.


Solution 1:

In Tinn-R select the menu:

 R -- Configure -- and then Temporarily or Permanent

Solution 2:

Manually edit file R-installation-directory/etc/Rconfig.site

The minimal line to add is:

.trPaths - paste(paste(Sys.getenv('APPDATA'), '\\Tinn-R\\tmp\\', 
sep=''), c('', 'search.txt', 'objects.txt', 'file.r', 'selection.r', 
'block.r', 'lines.r'), sep='')



Hope it helps


no for me - I do not switched to permanent and i did not have the 
Rconfig.site file edited but it is working

So i do not know *why it is working* ;-)

I hope yes for the others (I will forward it) on the computers where it 
is not working


Thank you Knut

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] survSplit with data.frame containing a Surv object

2009-07-13 Thread Heinz Tuechler


At 20:18 13.07.2009, Charles C. Berry wrote:

On Mon, 13 Jul 2009, Heinz Tuechler wrote:


Dear All,

since years I am struggling with Surv objects in data.frames. The 
following seems to have to do with it.
See below the modified example from the help page of survSplit. The 
original works, as expected. If, however, a Surv object is added to 
the data.frame, each record gets doubled.

Is there some solution other than avoiding Surv objects in data.frames?


I think you can modify survSplit so that it will properly handle Surv objects.

Change this line:

newdata - lapply(data, rep, ntimes + 1)

to this:

newdata - lapply(data,
function(x) {
x - as.matrix(x)
x[rep(1:nrow(x), ntimes + 1),]
})

or something similar that results Surv objects being rep()'ed 
rowwise rather than elementwise and returned as objects of the right 
dimension (rather than as a vector).


Caveat: This works in the example you give, but I've not tested this 
extensively.


HTH,

Chuck





Thanks,
Heinz


require(survival)

## from the help page
aml3-survSplit(aml,cut=c(5,10,50),end=time,start=start,
  event=status,episode=i)

summary(aml)
summary(aml3)

coxph(Surv(time,status)~x,data=aml)
## the same
coxph(Surv(start,time,status)~x,data=aml3)

## added to show doubling of records
aml.so - aml
aml.so$surv.object - with(aml, Surv(time, status))

aml3.so - survSplit(aml.so ,cut=c(5,10,50),end=time,start=start,
event=status,episode=i)
summary(aml3.so)

sessionInfo('survival')
R version 2.9.1 Patched (2009-07-07 r48910)
i386-pc-mingw32

locale:
LC_COLLATE=German_Switzerland.1252;LC_CTYPE=German_Switzerland.1252;LC_MONETARY=German_Switzerland.1252;LC_NUMERIC=C;LC_TIME=German_Switzerland.1252

attached base packages:
character(0)

other attached packages:
[1] survival_2.35-4

loaded via a namespace (and not attached):
[1] base_2.9.1  graphics_2.9.1  grDevices_2.9.1 methods_2.9.1
[5] splines_2.9.1   stats_2.9.1 utils_2.9.1

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Charles C. Berry(858) 534-2098
Dept of 
Family/Preventive Medicine

E mailto:cbe...@tajo.ucsd.edu   UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 92093-0901


Thank you Chuck,

it seems to work also with my real data, but I noted that in the 
example aml$x, which is a factor, gets converted to character in 
aml3.so. Maybe, if I find the time, I should look at 
as.data.frame.matrix and rbind for Surv objects.


Thanks again,
Heinz

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Help get this simple function to work...

2009-07-13 Thread jim holtman

Your basic problem is that you are trying to change the value of a
global object (TotalCover.df) from within a function.  A better
approach is to pass in the object that you are changing and then
return it as the value of the function.  You can then assign it back
to the original object if you want.  I think something like this
should work, but the code you have is not executable since some
objects are not defined (e.g., data.df):

#Define Variables
Quadrats.df-unique(data.df$Quadrat)
TotalCover.df-cbind(0:750/10,0,0,0,0,0,0)
   colnames(TotalCover.df)-
c(Station,Shrub,Tree,Invasive,Herb,Litter,Bare)
Shrub.df-data.df[data.df$Layer==Shrub,]
Tree.df-data.df[data.df$Layer==Tree,]

Cover.fn-function(ID, TC.df){

   ShrubCover.df-Shrub.df[Shrub.df$Quadrat==ID,]
   for (j in 1:length(ShrubCover.df[,Quadrat])){
   for (i in 1:751){
   if(TC.df[i,Station]=ShrubCover.df[j,Start]
 TC.df[i,Station]= ShrubCover.df[j,Stop])
   TC.df[i,Shrub]- 1
   }
   }
   TreeCover.df-Tree.df[Tree.df$Quadrat==ID,]
   for (j in 1:length(TreeCover.df[,Quadrat])){
   for (i in 1:751){
   if(TC.df[i,Station]=TreeCover.df[j,Start]
 TC.df[i,Station]= TreeCover.df[j,Stop])
   TC.df[i,Tree]- 1
   }
   }
TC.df   # return value
}


Cover.fn(VFFF1-7, TotalCover.df)

On Mon, Jul 13, 2009 at 2:06 PM, chipmaneychipma...@hotmail.com wrote:

 I have a function (see below).  This function has one object, ID.  If I run
 the loops by themselves using a character value (ie,VFFF1-7) instead of
 the function object, then the loops work fine.  However, when I try to
 insert the character value via the function call, it doesn't work. I don't
 get an error, but the TotalCover.df dataframe does not update according to
 the loop criteria. Any obvious problems that you can see?

 Cover Function#

 #Define Variables
 Quadrats.df-unique(data.df$Quadrat)
 TotalCover.df-cbind(0:750/10,0,0,0,0,0,0)
    colnames(TotalCover.df)-
 c(Station,Shrub,Tree,Invasive,Herb,Litter,Bare)
 Shrub.df-data.df[data.df$Layer==Shrub,]
 Tree.df-data.df[data.df$Layer==Tree,]

 Cover.fn-function(ID){

    ShrubCover.df-Shrub.df[Shrub.df$Quadrat==ID,]
        for (j in 1:length(ShrubCover.df[,Quadrat])){
            for (i in 1:751){
                if    (TotalCover.df[i,Station]=ShrubCover.df[j,Start]
  TotalCover.df[i,Station]= ShrubCover.df[j,Stop])
                    TotalCover.df[i,Shrub]- 1
            }
        }
    TreeCover.df-Tree.df[Tree.df$Quadrat==ID,]
        for (j in 1:length(TreeCover.df[,Quadrat])){
            for (i in 1:751){
                if    (TotalCover.df[i,Station]=TreeCover.df[j,Start]
  TotalCover.df[i,Station]= TreeCover.df[j,Stop])
                    TotalCover.df[i,Tree]- 1
            }
        }
 }


 Cover.fn(VFFF1-7)
 --
 View this message in context: 
 http://www.nabble.com/Help-get-this-simple-function-to-work...-tp24466533p24466533.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] re gression with replication

2009-07-13 Thread Ben Bolker




AgusSusanto wrote:
 
 Dear all,
 I would like to fit a linear regression with replication (on each year,
 observation is replicated, e.g 4 times). The independent variable ranges
 for instance 1-5 year, so I expect to have a linear fit of 5 points.
 For that purpose I do these (with dummy variables x and y):
 
 x-rep(seq(1:5),4)
 y-rnorm(20)
 linreg-lm(y~x)
 fitted.values(linreg)  # why produce 20 points of estimate?
 predict(linreg)# why produce 20 points of estimate?
 
 Please somebody explain:
 1. why both fitted.values and predict functions produced 20 points of
 estimate, NOT 5 points.
 
   Because R doesn't know your regression is replicated. It just knows you
 gave it
 20 pairs of predictors and responses.
 
Perhaps you want
 
 predict(linreg,newdata=list(x=unique(x)))   ?
 
 2. is lm(y~x) correct to solve this regression case, or there's a
 correct procedure.
 
 

If you expected the samples within year to be independent of each other
(which seems a little dicey) then this would be correct; otherwise the
coefficients
will be correct but the residual df etc. will be wrong.  You could try
something like

ymeans  - tapply(y,x,mean)
xvals - unique(x)
lm(ymeans ~ xvals)

or a more complicated approach (probably unnecessary here):

library(nlme)

lme(y~x,random=~1|year)

[I think]


-- 
View this message in context: 
http://www.nabble.com/regression-with-replication-tp24468326p24470213.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] survSplit with data.frame containing a Surv object


On Mon, 13 Jul 2009, Heinz Tuechler wrote:


At 20:18 13.07.2009, Charles C. Berry wrote:

On Mon, 13 Jul 2009, Heinz Tuechler wrote:

 Dear All,
 
 since years I am struggling with Surv objects in data.frames. The 
 following seems to have to do with it.
 See below the modified example from the help page of survSplit. The 
 original works, as expected. If, however, a Surv object is added to the 
 data.frame, each record gets doubled.

 Is there some solution other than avoiding Surv objects in data.frames?

I think you can modify survSplit so that it will properly handle Surv 
objects.


Change this line:

 newdata - lapply(data, rep, ntimes + 1)

to this:

 newdata - lapply(data,
 function(x) {
 x - as.matrix(x)
 x[rep(1:nrow(x), ntimes + 1),]
 })

or something similar that results Surv objects being rep()'ed rowwise 
rather than elementwise and returned as objects of the right dimension 
(rather than as a vector).


Caveat: This works in the example you give, but I've not tested this 
extensively.


HTH,

Chuck



 
 Thanks,

 Heinz
 
 
 require(survival)
 
 ## from the help page

 aml3-survSplit(aml,cut=c(5,10,50),end=time,start=start,
event=status,episode=i)
 
 summary(aml)

 summary(aml3)
 
 coxph(Surv(time,status)~x,data=aml)

 ## the same
 coxph(Surv(start,time,status)~x,data=aml3)
 
 ## added to show doubling of records

 aml.so - aml
 aml.so$surv.object - with(aml, Surv(time, status))
 
 aml3.so - survSplit(aml.so ,cut=c(5,10,50),end=time,start=start,

  event=status,episode=i)
 summary(aml3.so)
 
 sessionInfo('survival')

 R version 2.9.1 Patched (2009-07-07 r48910)
 i386-pc-mingw32
 
 locale:

 
LC_COLLATE=German_Switzerland.1252;LC_CTYPE=German_Switzerland.1252;LC_MONETARY=German_Switzerland.1252;LC_NUMERIC=C;LC_TIME=German_Switzerland.1252
 
 attached base packages:

 character(0)
 
 other attached packages:

 [1] survival_2.35-4
 
 loaded via a namespace (and not attached):

 [1] base_2.9.1  graphics_2.9.1  grDevices_2.9.1 methods_2.9.1
 [5] splines_2.9.1   stats_2.9.1 utils_2.9.1
 
 __

 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
 http://www.R-project.org/posting-guide.html

 and provide commented, minimal, self-contained, reproducible code.

Charles C. Berry(858) 534-2098
 Dept of 
Family/Preventive Medicine

E mailto:cbe...@tajo.ucsd.edu   UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 92093-0901


Thank you Chuck,

it seems to work also with my real data, but I noted that in the example 
aml$x, which is a factor, gets converted to character in aml3.so. Maybe, if I 
find the time, I should look at as.data.frame.matrix and rbind for Surv 
objects.





Heinz,

Try

newdata - lapply(data, function(x) {
if (!is.matrix(x)) {
rep(x,ntimes +1 )
} else {
x - as.matrix(x)
x[rep(1:nrow(x), ntimes + 1),]}})


HTH,

Chuck


Thanks again,
Heinz

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Charles C. Berry(858) 534-2098
Dept of Family/Preventive Medicine
E mailto:cbe...@tajo.ucsd.edu   UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] survSplit with data.frame containing a Surv object