[R] Help With Fleiss Kappa

2009-07-13 Thread mehdi ebrahim
Hi All,

I am using fleiss kappa for inter rater agreement.  Are there any know
issues with Fleiss kappa calculation in R?  Even when I supply mock data
with total agreement among the raters I do not get a kappa value of 1.
instead I am getting negative values.

I am using the irr package version 0.70

Any help is much appreciated.

Thanks and Regards

M

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] arr.lcol in plotmat from diagram package

2009-07-13 Thread Soetaert, Karline
Rajesh,

A quick reply to your questions concerning the diagram package:

To use different colors per arrow: just define arr.col as a
Matrix or a data.frame, and give the color number or colorname to each arrow:

To use numbers:

  M - matrix(nrow=4,ncol=4,data=0)
  M[2,1]-1  ;M[4,2]-2;M[3,4]-3;M[1,3]-4
  pp-plotmat(M,pos=c(1,2,1),curve=0.2,name=letters[1:4],lwd=1,box.lwd=2,
  cex.txt=0.8,arr.type=triangle,box.size=0.1,box.type=rect,
  box.prop=0.5,arr.len=0.6,arr.col=M)

To use color names:

  arr.col - as.data.frame(M)
  arr.col[2,1]-red
  arr.col[4,2]-black
  arr.col[3,4]-darkgreen
  arr.col[1,3]-orange
   pp-plotmat(M,pos=c(1,2,1),curve=0.2,name=letters[1:4],lwd=1,box.lwd=2,
  cex.txt=0.8,arr.type=triangle,box.size=0.1,box.type=rect,
  box.prop=0.5, arr.len=0.8,arr.col=arr.col)


Karline

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] for (n in SystemResults$EnTime) return EnTime[n] until reaching (all)

2009-07-13 Thread Petr PIKAL
Hi

r-help-boun...@r-project.org napsal dne 12.07.2009 22:24:29:

 On Sun, Jul 12, 2009 at 1:05 PM, David Winsemiusdwinsem...@comcast.net 
wrote:
 
  On Jul 12, 2009, at 3:35 PM, David Winsemius wrote:
 
 
  On Jul 12, 2009, at 2:53 PM, Mark Knecht wrote:
 
 
   As a test I tried to print down to the string (all) and then
  break but this code and everything I've tried so far is terribly
  wrong. Every attempt prints lots of error messages. I'm not grasping
  at all what I'm doing wrong or what's the right way to do this sort 
of
  thing. Clearly my first for loop isn't a success!
 
  for(n in SystemResults$EnTime)  {
 if(SystemResults$EnTime[n] == (all)) break)
 
  Inside the loop, shouldn't you be comparing to n?? As you have it 
now,
  the values of that factor are probably being used as indices to 
itself. (Not
  good.) Also not good is the use of break.  It looks to be fairly 
severely
  deprecated at this point
 
  Appears I am wrong about this. I was basing my assumption on this
  interaction with the R interpreter:
  ?break
  Error in genericForPrimitive(f) :
   methods may not be defined for primitive function break in this 
version
  of R
 
  But:
 
  ?Control ... suggests that break-ing out of for loops remains 
acceptable.
 
 
  David Winsemius, MD
  Heritage Laboratories
  West Hartford, CT
 
 
 
 Hi David,
Thanks for the response. It is helping.
 
I think the break is required as your suggestion doesn't exit the
 loop i there is more data like mine. It just skips printing the (all)
 but incorrectly prints the other copies down lower in the data frame:
 
   tf - factor(c(53 ,  906  , 919  , 932 ,  945 ,  958 ,  1011 , 1024 , 

 (all), 53 ,  906  , 919  , 932 ,  945 ,  958 ,  1011 , 1024 , (all) 
) )
   for(n in tf ) {if (n != (all))  print(n)}
 [1] 53
 [1] 906
 [1] 919
 [1] 932
 [1] 945
 [1] 958
 [1] 1011
 [1] 1024
 [1] 53
 [1] 906
 [1] 919
 [1] 932
 [1] 945
 [1] 958
 [1] 1011
 [1] 1024
 
 
 whereas the else break gets me out:
 
   tf - factor(c(53 ,  906  , 919  , 932 ,  945 ,  958 ,  1011 , 1024 , 

 (all), 53 ,  906  , 919  , 932 ,  945 ,  958 ,  1011 , 1024 , (all) 
) )
   for(n in tf ) {if (n != (all))  print(n) else break}
 [1] 53
 [1] 906
 [1] 919
 [1] 932
 [1] 945
 [1] 958
 [1] 1011
 [1] 1024
 
 
 My confusion here is really how the 'n' is being used. I thought it
 was just an index - a number that gets used inside of the curly braces
 like other languages I've used. It seems it isn't that at all but
 really operates as something that returns the actual value of the
 position in the factor. I was trying to reference the location in the
 list but for is already returning the value. Strange, but I'm sure
 there are good reasons. Please note that I anot a prgrammer and have
 no formal training so it hardly matters what I think! :-)
 
 I'm now wondering if I'd be better off to try using ?match to find the
 first position of (all) instead of using the for loop? If match

like that?

min(which(tf %in% (all)))

Regards
Petr


 returned a number then I think I'd be more comfortable, but maybe I
 should keep going the way I am.
 
 It seems I'm maybe getting the getting the correct answer now but I'm
 concerned that it's coming back with quotes. I can get around that
 using
 
 print(as.integer(x))
 
 but all this coercion stuff that R is doing is giving me fits. I just
 don't have my head around it yet. None the less R is already giving me
 visibility into my data that I've not had before so overall the
 results are strongly positive.
 
 Thanks,
 Mark
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] simulation in multilevel model

2009-07-13 Thread Zoltan
Dear List Members,

I am running a random intercept and random slope (2 cross-level
interactions) MLM using package lme4 (individuals nested in countries,
number of 2nd level units is 21). My DV (member of a trade union or not) is
dichotomous, and thus it is a logistic multilevel model. The model converges
and the results are significant. I would like to ask what solution can be
used, if I want to simulate the model results for a hypothetical individual.
More precisely, I am interested in comparing the probabilities of being a
trade union member across countries, using a hypothetical individual. For
this, I have the values for all the independent variables, but a simple
calculation based on the coefficients is not enough, since it does not
encompass the random effects in the calculation of the probabilities.

What is the solution to estimate these log odds, using the model
coefficients and still including tha random effects?

Thank you,
-- 
Zoltan Fazekas
MA Political Science Department
CEU, Budapest

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Problems in plotting with abline

2009-07-13 Thread anupam sinha
Dear R-users,
 I am using R(a package igraph) to calculate certain
topological features of networks. When I try to draw a plot between these
features I get an error. Following is the code I am using :

* plot(met_eco_deg,met_eco_bet)
 lmout-lm(met_eco_bet ~ met_eco_deg)
 abline(lmout)
Error in int_abline(a = a, b = b, h = h, v = v, untf = untf, ...) :
  plot.new has not been called yet*

*met_eco_deg* and *met_eco_bet* are the two objects generated from igraph
package. I don't get any error when I just use the plot function. Can anyone
help me out ?Thanks in advance.


Regards,

Anupam Sinha

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Problems in plotting with abline

2009-07-13 Thread anupam sinha
Thanks a million for your suggestion. It works.

On Mon, Jul 13, 2009 at 2:29 PM, Alain Guillet
alain.guil...@uclouvain.bewrote:

 Hello,

 The error message says there is no opened graphical windows. You cannot use
 abline if there is no open graphical windows. Try this:

 plot(met_eco_deg,met_eco_bet)
 abline(lmout)

 Regards,
 Alain




 anupam sinha wrote:

 Dear R-users,
 I am using R(a package igraph) to calculate certain
 topological features of networks. When I try to draw a plot between these
 features I get an error. Following is the code I am using :

 * plot(met_eco_deg,met_eco_bet)


 lmout-lm(met_eco_bet ~ met_eco_deg)
 abline(lmout)


 Error in int_abline(a = a, b = b, h = h, v = v, untf = untf, ...) :
  plot.new has not been called yet*

 *met_eco_deg* and *met_eco_bet* are the two objects generated from igraph
 package. I don't get any error when I just use the plot function. Can
 anyone
 help me out ?Thanks in advance.


 Regards,

 Anupam Sinha

[[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




 --
 Alain Guillet
 Statistician and Computer Scientist

 SMCS - Institut de statistique - Université catholique de Louvain
 Bureau d.126
 Voie du Roman Pays, 20
 B-1348 Louvain-la-Neuve
 Belgium

 tel: +32 10 47 30 50



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Odp: Problems in plotting with abline

2009-07-13 Thread Petr PIKAL
Hi

r-help-boun...@r-project.org napsal dne 13.07.2009 10:45:50:

 Dear R-users,
  I am using R(a package igraph) to calculate certain
 topological features of networks. When I try to draw a plot between 
these
 features I get an error. Following is the code I am using :
 
 * plot(met_eco_deg,met_eco_bet)
  lmout-lm(met_eco_bet ~ met_eco_deg)
  abline(lmout)
 Error in int_abline(a = a, b = b, h = h, v = v, untf = untf, ...) :
   plot.new has not been called yet*
 
 *met_eco_deg* and *met_eco_bet* are the two objects generated from 
igraph
 package. I don't get any error when I just use the plot function. Can 
anyone
 help me out ?Thanks in advance.

I do not have any experience in igraph but the error from abline states 
that plot is not initialised. I suspected that igraph is based on grid 
graphics and in this case you can not easily mix base and grid graphics, 
however from docs it seems that plain igraph plots are based on base 
graphics therefore it shall work.

If your syntax is really only those 3 lines one after another do you see a 
plot after plot(met_eco_deg,met_eco_bet)? If not this is the problem. If 
after plot some plot appears and abline issues an error, than the problem 
is not so simple and without reproducible example it would be quite tricky 
to find out what is going on.

Regards
Petr

 
 
 Regards,
 
 Anupam Sinha
 
[[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Combine two matricies

2009-07-13 Thread Tom Liptrot

Hi,

I have two matricies a and x:

a-matrix(c(3,4,5,2,3,4,1,1,2), nrow=3, ncol=3)

 [,1] [,2] [,3]
[1,]321
[2,]431
[3,]542

x-matrix(c(3, NA, NA, NA, 2, 5, NA, 2, 2), nrow=3, ncol=3)

 [,1] [,2] [,3]
[1,]3   NA   NA
[2,]   NA22
[3,]   NA52

I wish to combine these two into one matrix using the values from x where x has 
values, and values from a where x has NA's, giving a new matrix which would 
look like this:

ax-matrix(c(3,4,5,2,2,5,1,2,2), nrow=3, ncol=3)

[,1] [,2] [,3]
[1,]321
[2,]422
[3,]552

I want an automatic way of doing this as my actual application is a much larger 
matrix.

Thanks in advance

Tom


_

[[elided Hotmail spam]]


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Combine two matricies

2009-07-13 Thread Dimitris Rizopoulos

try this:

a - matrix(c(3,4,5,2,3,4,1,1,2), nrow=3, ncol=3)
x - matrix(c(3, NA, NA, NA, 2, 5, NA, 2, 2), nrow=3, ncol=3)

ind - is.na(x)
x[ind] - a[ind]
x


I hope it helps.

Best,
Dimitris


Tom Liptrot wrote:

Hi,

I have two matricies a and x:

a-matrix(c(3,4,5,2,3,4,1,1,2), nrow=3, ncol=3)

 [,1] [,2] [,3]
[1,]321
[2,]431
[3,]542

x-matrix(c(3, NA, NA, NA, 2, 5, NA, 2, 2), nrow=3, ncol=3)

 [,1] [,2] [,3]
[1,]3   NA   NA
[2,]   NA22
[3,]   NA52

I wish to combine these two into one matrix using the values from x where x has 
values, and values from a where x has NA's, giving a new matrix which would 
look like this:

ax-matrix(c(3,4,5,2,2,5,1,2,2), nrow=3, ncol=3)

[,1] [,2] [,3]
[1,]321
[2,]422
[3,]552

I want an automatic way of doing this as my actual application is a much larger 
matrix.

Thanks in advance

Tom


_

[[elided Hotmail spam]]


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus University Medical Center

Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Combine two matricies

2009-07-13 Thread Michael Knudsen
On Mon, Jul 13, 2009 at 11:31 AM, Tom Liptrot tomlipt...@hotmail.com wrote:

 I wish to combine these two into one matrix using the values from x where x 
 has values, and values from a where x has NA's, giving a new matrix which 
 would look like this:

This should do the trick:

x[which(is.na(x))]=a[which(is.na(x))]

--
Michael Knudsen
micknud...@gmail.com
http://www.google.com/profiles/micknudsen

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Combine two matricies

2009-07-13 Thread Paul Chatfield

The following code should do it.  This assumes matrices a and x are of the
same dimension which is why  you can index a as below

x[is.na(x)==TRUE]-a[is.na(x)==TRUE]

-- 
View this message in context: 
http://www.nabble.com/Combine-two-matricies-tp24458609p24458797.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Add grand mean to every entry in a matrix

2009-07-13 Thread Tom Liptrot

Hi,

I have a matrix:

mymat - matrix(runif(10*4), ncol=4)

I wish to subtract the column means, down the colums, subtract the row means 
from the rows and add back the grand mean - all the means should be the means 
of mymat rather than of the new matrix. 

How can I do this?

Any help much appreciated.

Thanks

Tom

_


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Odp: Combine two matricies

2009-07-13 Thread Petr PIKAL
Hi

matrix is virtually a vector so you can find index of values of x and add 
those values to propper places in ax

ax - a
idx - which(!is.na(x))
ax[idx] - x[idx]
ax
 [,1] [,2] [,3]
[1,]321
[2,]422
[3,]552

Regards
Petr

r-help-boun...@r-project.org napsal dne 13.07.2009 11:31:12:

 
 Hi,
 
 I have two matricies a and x:
 
 a-matrix(c(3,4,5,2,3,4,1,1,2), nrow=3, ncol=3)
 
  [,1] [,2] [,3]
 [1,]321
 [2,]431
 [3,]542
 
 x-matrix(c(3, NA, NA, NA, 2, 5, NA, 2, 2), nrow=3, ncol=3)
 
  [,1] [,2] [,3]
 [1,]3   NA   NA
 [2,]   NA22
 [3,]   NA52
 
 I wish to combine these two into one matrix using the values from x 
where x 
 has values, and values from a where x has NA's, giving a new matrix 
which 
 would look like this:
 
 ax-matrix(c(3,4,5,2,2,5,1,2,2), nrow=3, ncol=3)
 
 [,1] [,2] [,3]
 [1,]321
 [2,]422
 [3,]552
 
 I want an automatic way of doing this as my actual application is a much 
larger matrix.
 
 Thanks in advance
 
 Tom
 
 
 _
 
 [[elided Hotmail spam]]
 
 
[[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Add grand mean to every entry in a matrix

2009-07-13 Thread Gavin Simpson
On Mon, 2009-07-13 at 09:06 +, Tom Liptrot wrote:
 Hi,
 
 I have a matrix:
 
 mymat - matrix(runif(10*4), ncol=4)
 
 I wish to subtract the column means, down the colums, subtract the row
 means from the rows and add back the grand mean - all the means should
 be the means of mymat rather than of the new matrix. 
 
 How can I do this?
 
 Any help much appreciated.

See ?sweep as one way to remove a set of statistics from a matrix. To
compute the statistics to sweep, you should look at ?rowMeans
and ?colMeans, plus ?mean (for the overall mean of the matrix).

The function below encapsulates the various steps that will do the
manipulation for you. I've added a conversion for the input object if it
is a data frame as mean() works differently on a matrix compared to a
df. (Your actual data is more likely to be in a data frame than a matrix
initially.)

set.seed(1)
mymat - matrix(runif(10*4), ncol=4)

grandMean - function(m) {
 if((df - is.data.frame(m)))
 m - data.matrix(m)
 rm - rowMeans(m)
 cm - colMeans(m)
 gm - mean(m)
 m - sweep(m, 1, rm, -) # 1 means from the rows
 m - sweep(m, 2, cm, -) # 2 means from the cols
 m - m + gm # here we treat m as a vector
 if(df) # return to a data.frame if one originally
 m - as.data.frame(m)
 return(m)
}

grandMean(mymat)
grandMean(as.data.frame(mymat))

HTH

G
 
 Thanks
 
 Tom

-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,  [f] +44 (0)20 7679 0565
 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London  [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] How to perform a Likelihood-ratio test?

2009-07-13 Thread Lars Bergemann

Hi.

 

I would like to use a likelihood-ratio test to compare a linear model (lm()) to 
another linear model containing the first one to see if the extra factors are 
needed - but I was unable to find any help on how to do that.

Could you help me and tell me what to do? Or tell me where to find help on this 
topic?

 

Many thanks in advance!

 

Lars

_

s. It's easy!

aspxmkt=en-us
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Bug in package.skeleton, R 2.9.0?

2009-07-13 Thread Daniel Klevebring
Dear all,

I am using package.skeleton to build a small packages of misc function  
for personal use. I have recently discovered that the option  
force=TRUE doesn't seem to do what is meant to do. Here's what I'm  
doing:

  setwd(/Users/danielk/Documents/R/packages/dk)
  files - paste(codebase, dir(codebase, pattern=.R), sep=/)
  package.skeleton(name=dk, force=TRUE, code_files=files)
Creating directories ...
Creating DESCRIPTION ...
Creating Read-and-delete-me ...
Copying code files ...
Making help files ...
Done.
Further steps are described in './dk/Read-and-delete-me'.
 

Now, everything seems fine, but changes to files in me codebase  
folder, doesn't come along if the folder dk/R already contains the  
files, even though I use force=TRUE. If I remove the dk/R folder or  
the dk folder altogether, the changes come along so to me it seems  
that it's the overwrite part that doesn't work as it should - or am I  
doing something wrong here? To me, it seems that the function  
safe.dir.create (which is defined in package.skeleton never overwrites  
folders, yielding force=TRUE useless.

See below for sessionInfo.

Thanks a bunch
Daniel



  sessionInfo()
R version 2.9.0 (2009-04-17)
i386-apple-darwin8.11.1

locale:
en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

loaded via a namespace (and not attached):
[1] tools_2.9.0
 


--

Contact information:

Daniel Klevebring
M. Sc. Eng., Ph.D. Student
Dept of Gene Technology
Royal Institute of Technology, KTH
SE-106 91 Stockholm, Sweden

Visiting address: Roslagstullsbacken 21, B3
Delivery address: Roslagsvägen 30B, 104 06, Stockholm
Invoice address: KTH Fakturaserice, Ref DAKL KTHBIO, Box 24075,  
SE-10450 Stockholm
E-mail: dan...@biotech.kth.se
Phone: +46 8 5537 8337 (Office)
Phone: +46 704 71 65 91 (Mobile)
Web: http://www.biotech.kth.se/genetech/index.html
Fax: +46 8 5537 8481

--

Contact information:

Daniel Klevebring
M. Sc. Eng., Ph.D. Student
Dept of Gene Technology
Royal Institute of Technology, KTH
SE-106 91 Stockholm, Sweden

Visiting address: Roslagstullsbacken 21, B3
Delivery address: Roslagsvägen 30B, 104 06, Stockholm
Invoice address: KTH Fakturaserice, Ref DAKL KTHBIO, Box 24075,  
SE-10450 Stockholm
E-mail: dan...@biotech.kth.se
Phone: +46 8 5537 8337 (Office)
Phone: +46 704 71 65 91 (Mobile)
Web: http://www.biotech.kth.se/genetech/index.html
Fax: +46 8 5537 8481
MSN messenger: klevebr...@msn.com


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] pbc data

2009-07-13 Thread Λεωνίδας Μπαντής


Hi there,

Can anyone please help me because I am going to get crazy with the pbc data 
set. I just want to apply simple cox regression in the data set. I am a 
beginner in R but I don't think I am doing anything wrong. 

I have the book of Fleming and Harrington 1990. I perform cox regression by 
typing:

out- coxph(Surv(times/365,status)~log(bili)+log(proth)+edema+log(albumin)+age)
out


Call:
coxph(formula = Surv(times/365, status) ~ log(bili) + log(proth) + 
    edema + log(albumin) + age)


    coef exp(coef) se(coef) z   p
log(bili) 0.8636    2.3716  0.08294 10.41 0.0e+00
log(proth)    2.3868   10.8791  0.76851  3.11 1.9e-03
edema 0.8963    2.4505  0.27141  3.30 9.6e-04
log(albumin) -2.5069    0.0815  0.65292 -3.84 1.2e-04
age   0.0396    1.0404  0.00767  5.16 2.4e-07

Likelihood ratio test=231  on 5 df, p=0  n=416 (2 observations deleted due to 
missingness)

Which is not exactly what fleming presents in table 4.6.3 page 195. Edema coef 
is 0.8592.

I searched the net (please google: modern regression methods for survival data 
stockholm)

and there some slides there. The results of the slides are again slightly 
different and the analysis seems to be in R. So we all disagree. (I notice that 
2 observation are excluded in my analysis, but not in the slides' analysis)

Also I check the age column, in the book the first two elemets are, 58.7652 and 
56.4463 while in R 58.80548 56.48493. Age looks totally different.

Can anyone please help me. Also in the article Survival model predictive 
accuracy and ROC curves , which is free just google it, in page 27 the results 
seem to be different again. (caution with log(alb) or alb).

Does anyone has any information about this data set that clarifies things for 
me??


Leo.





  
___ 
×ñçóéìïðïéåßôå Yahoo!; 
ÂáñåèÞêáôå ôá åíï÷ëçôéêÜ ìçíýìáôá (spam); Ôï Yahoo! Mail 
äéáèÝôåé ôçí êáëýôåñç äõíáôÞ ðñïóôáóßá êáôÜ ôùí åíï÷ëçôéêþí 
ìçíõìÜôùí http://login.yahoo.com/config/mail?.intl=gr

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] zero cells in one variable in logistic regression

2009-07-13 Thread anna.bucharova

Dear all.
I am sort of beginner with R. I do logistic regression with binomial
response variable and several continuous and categorical variables. In one
categorical variable, zero cell occures (2x2 table looks like 
7 - 0 
23 - 25
This leads to overestimating of odds ratio and inflated confidence interval
for odds for given variable. The variable is significant in univariate test.
I do not necessarilly need odd ratio, but I need the explained deviance by
this variable and I really want to keep this variable in the model. It
probably matters for explained deviance. How to treat this problem?
Thanks for help, Anna Bucharova
-- 
View this message in context: 
http://www.nabble.com/zero-cells-in-one-variable-in-logistic-regression-tp24458629p24458629.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Times series adjustment

2009-07-13 Thread Poizot Emmanuel

Dear all,

I want make correction depth of a bathymetric data set.
To do so, I have the depth data set sample every second (a depth at each 
second) in one hand, and in the other hand, I have a tide variation 
level data set sample every 250 ms. The time register in each data sets 
(tide and bathymetric) is express in seconds followinf this format : 
hh:mm:ss.ss
I would like to rectify the depth (bathymetry) by substrating the tide 
at each depth time.

I tried to construct time series using pastecs package but didn't succed

Example of the tide data set:
Time Correction
11:55:07.58  -2.64
11:55:07.68  -2.64
11:55:07.88  -2.64
11:55:08.28  -2.64
11:55:08.38  -2.64

Example of the bathymetric data:
XY Date   Time  Depth
-2.620748 48.75121 01/00/00 12:07:16.000 -25.6
-2.620714 48.75121 01/00/00 12:07:17.000 -25.8
-2.620698 48.75121 01/00/00 12:07:17.000 -26.0
-2.620682 48.75121 01/00/00 12:07:18.000 -26.1
-2.620667 48.75121 01/00/00 12:07:18.000 -26.1

How can I construct the time series based on that two data sets allowing 
me then to to the depth rectification ?

Regards

--
Cordialement


Emmanuel Poizot
Cnam/Intechmer
B.P. 324
50103 Cherbourg Cedex

Phone (Direct) : (00 33)(0)233887342
Fax : (00 33)(0)233887339


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] zero cells in one variable in logistic regression

2009-07-13 Thread David Winsemius


On Jul 13, 2009, at 5:37 AM, anna.bucharova wrote:



Dear all.
I am sort of beginner with R. I do logistic regression with binomial
response variable and several continuous and categorical variables.  
In one

categorical variable, zero cell occures (2x2 table looks like
7 - 0
23 - 25
This leads to overestimating of odds ratio and inflated confidence  
interval
for odds for given variable. The variable is significant in  
univariate test.
I do not necessarilly need odd ratio, but I need the explained  
deviance by

this variable and I really want to keep this variable in the model. It
probably matters for explained deviance. How to treat this problem?
Thanks for help, Anna Bucharova
--


You might consider glmrob in package:robustbase. See 
http://www.jstatsoft.org/v10/i04

David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Times series adjustment

2009-07-13 Thread Gabor Grothendieck
Try this:


# First read in the data using chron times class

Lines.Bary - XY Date   Time  Depth
-2.620748 48.75121 01/00/00 12:07:16.000 -25.6
-2.620714 48.75121 01/00/00 12:07:17.000 -25.8
-2.620698 48.75121 01/00/00 12:07:17.000 -26.0
-2.620682 48.75121 01/00/00 12:07:18.000 -26.1
-2.620667 48.75121 01/00/00 12:07:18.000 -26.1

Lines.Depth - 
Time Correction
11:55:07.58  -2.64
11:55:07.68  -2.64
11:55:07.88  -2.64
11:55:08.28  -2.64
11:55:08.38  -2.64

library(chron)
DF.Depth - read.table(textConnection(Lines.Depth), header = TRUE, as.is = TRUE)
DF.Depth$Time - times(DF.Depth$Time)

DF.Bary - read.table(textConnection(Lines.Bary), header = TRUE, as.is = TRUE)
DF.Bary$Time - times(DF.Bary$Time)

# Now perform the calculation using findInterval

DF.Bary$Depth - DF.Bary$Depth +
DF.Depth[findInterval(DF.Bary$Time, DF.Depth$Time), Correction]


On Mon, Jul 13, 2009 at 6:11 AM, Poizot Emmanuelemmanuel.poi...@cnam.fr wrote:
 Dear all,

 I want make correction depth of a bathymetric data set.
 To do so, I have the depth data set sample every second (a depth at each
 second) in one hand, and in the other hand, I have a tide variation level
 data set sample every 250 ms. The time register in each data sets (tide and
 bathymetric) is express in seconds followinf this format : hh:mm:ss.ss
 I would like to rectify the depth (bathymetry) by substrating the tide at
 each depth time.
 I tried to construct time series using pastecs package but didn't succed

 Example of the tide data set:
 Time             Correction
 11:55:07.58  -2.64
 11:55:07.68  -2.64
 11:55:07.88  -2.64
 11:55:08.28  -2.64
 11:55:08.38  -2.64

 Example of the bathymetric data:
 X                Y             Date       Time              Depth
 -2.620748 48.75121 01/00/00 12:07:16.000 -25.6
 -2.620714 48.75121 01/00/00 12:07:17.000 -25.8
 -2.620698 48.75121 01/00/00 12:07:17.000 -26.0
 -2.620682 48.75121 01/00/00 12:07:18.000 -26.1
 -2.620667 48.75121 01/00/00 12:07:18.000 -26.1

 How can I construct the time series based on that two data sets allowing me
 then to to the depth rectification ?
 Regards

 --
 Cordialement

 
 Emmanuel Poizot
 Cnam/Intechmer
 B.P. 324
 50103 Cherbourg Cedex

 Phone (Direct) : (00 33)(0)233887342
 Fax : (00 33)(0)233887339
 


 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Partial Correlation

2009-07-13 Thread Moumita Das
Why do we get Partial correlation values greater than 1?

 I have used the default function pcor.mat :--

I have manipulated the default pcor.mat function a bit so ignore tha
variables corr_type,element1_in_no,element2_in_no,P.Please ignore the
“pairwise” section and have a look at athe “listwise ” part i.e else part.

*pcor.mat -
function(x,y,z,method=p,na.rm=T,corr_type,element1_in_no,element2_in_no,P){
*

**

*print(pcor.mat)*

*x - c(x)*

*y - c(y)*

*z - as.data.frame(z)*

*print(z)*

*#print(element1_in_no)*

*#print(element1_in_no)*

**

* *

*if(dim(z)[2] == 0){   *

*stop(There should be given data\n)*

*}*

* *

*data - data.frame(x,y,z)*

**

**

*if(corr_type==pairwise)*

*{*

**

*print(inside pairwise)*

*rxx.z
-P[as.numeric(element1_in_no),as.numeric(element2_in_no)]*

*#print(rxx.z)*

*#print(rxx.z)*

**

*return(rxx.z)*

* *

*}*

*else*

*{*

*print(inside listwise)*

*if(na.rm == T){*

*data = na.omit(data)*

*}*

**

*xdata - na.omit(data.frame(data[,c(1,2)]))
#i1,C1*

*print(printing
xdata...)*

*print(xdata)*

*Sxx - cov(xdata,xdata,m=method)*

*
print(Sxx...)*

*print(Sxx )*

* *

*xzdata - na.omit(data)*

*xdata - data.frame(xzdata[,c(1,2)])*

*zdata - data.frame(xzdata[,-c(1,2)])*

*print(zdata..)*

*print(zdata)*

*Sxz - cov(xdata,zdata,m=method)*

*
print(Sxz.
)*

*print(Sxz)*

* *

*zdata -
na.omit(data.frame(data[,-c(1,2)]))*

*Szz - cov(zdata,zdata,m=method)*

*print(Szz
)*

*print(Szz)*

*#print(new type par corr)*

*#P-partialCorr_matrix(data)*

**

*}*

**

**

*# is Szz positive definite?*

*zz.ev - eigen(Szz)$values*

*if(min(zz.ev)[1]0){*

**

*stop(\'Szz\' is not positive definite!\n)
*

*}*

**

**

*# partial correlation*

*Sxx.z - Sxx - Sxz %*% solve(Szz) %*% t(Sxz)*

*print(Sxx.z)*

*print(qr(Sxx.z))*

*rxx.z - cov2cor(Sxx.z)[1,2]*

**

*return(rxx.z)*

*}*

*Cov2cor function:-*

*cov2cor-function (V) *

*{*

*   print(inside cov2cor)*

*  *

*   *

*   p - (d - dim(V))[1]*

*if (!is.numeric(V) || length(d) != 2L || p != d[2L]) *

*stop('V' is not a square numeric matrix)*

*Is - sqrt(1/diag(V))*

*print(Is)*

*print(Is)*

*if (any(!is.finite(Is))) *

*warning(diag(.) had 0 or NA entries; non-finite result is
doubtful)*

*r - V*

*r[] - Is * V * rep(Is, each = p)*

* print(r)*

* print(r[])*

* *

*r[cbind(1L:p, 1L:p)] - 1*

*r*

**

*}*

* *

Sxx , Sxz , Szz  all these three values I have calculated and they match
with SPSS result.

After that I am not understanding why my partial correlation result doesn’t
match with the SPSS result.

Sxx.z - Sxx - Sxz %*% solve(Szz) %*% t(Sxz)  # how to calculate in
SPSS.After this as you can see in pcor.mat function rxx.z -
cov2cor(Sxx.z)[1,2] is computed.



Don’t know where things are going wrong.**

*I will attach the datas of categories and items which form the categories.*

*Category 1 has items -1,5 *

*Category 2 has items – 2,4,6,7,8,9,11*

*Category 3 has items -3,12,14*

*Category 4 has items -10,13,15*

* Category values are actually mean of the items which form the category.*

Here in pcor.mat 

Re: [R] Splitting dataset for Tuning Parameter with Cross Validation

2009-07-13 Thread Tim


Seems to me if splitting once for all the bias will be big and if splitting 
once for each choice of parameters the variance ill be big.  

In LibSVM, for each choice of (c, gamma),  the searching script grid.py calls 
svm_cross_validation() which has a random split of the dataset. So seems to me 
it is the second method. 

As to the first one, I come to it in Ch 7 Section 10 of The Elements of 
Statistical Learning by Hastie where it says first split the dataset, then 
evaluate validation error CV(alpha) and vary the complexity parameter value 
alpha to find the one giving smallest validation error. It appears to me the 
splitting is once for all choices of the complexity parameter.

Thanks!

--- On Sun, 7/12/09, Tim timlee...@yahoo.com wrote:

 From: Tim timlee...@yahoo.com
 Subject: [R] Splitting dataset for Tuning Parameter with Cross Validation
 To: r-h...@stat.math.ethz.ch
 Date: Sunday, July 12, 2009, 6:58 PM
 
 Hi,
 My question might be a little general.
 
 I have a number of values to select for the complexity
 parameters in some classifier, e.g. the C and gamma in SVM
 with RBF kernel. The selection is based on which values give
 the smallest cross validation error.
 
 I wonder if the randomized splitting of the available
 dataset into folds is done only once for all those choices
 for the parameter values, or once for each choice? And why?
 
 Thanks and regards!
 
 __
 R-help@r-project.org
 mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained,
 reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help With Fleiss Kappa

2009-07-13 Thread David Freedman

If you apply the function to a simple dataframe or show your code, you might
be able to get more accurate help.  I've used the IRR package in the past
and haven't noticed any problems (maybe I overlooked them ?)

david freedman

mehdi ebrahim wrote:
 
 Hi All,
 
 I am using fleiss kappa for inter rater agreement.  Are there any know
 issues with Fleiss kappa calculation in R?  Even when I supply mock data
 with total agreement among the raters I do not get a kappa value of 1.
 instead I am getting negative values.
 
 I am using the irr package version 0.70
 
 Any help is much appreciated.
 
 Thanks and Regards
 
 M
 
   [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 

-- 
View this message in context: 
http://www.nabble.com/Help-With-Fleiss-Kappa-tp24456963p24461821.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Fliess Kappa

2009-07-13 Thread Sunita22

Hello Everyone

I am calculating Fleiss Kappa, I have 28 raters, 5 Subjects and 5 ratings
(i.e. A dataset with 28 columns and 5 rows). The problem is that there are 2
missing values in the data. (Right now I have manually deleted the values in
csv file and then imported it in R

1) Would it better to replace those with 0 or should those be omitted? By
omission I will be left with only 26 raters.

2) Also when we use na.omit(objectname) it deleted missing values row-wise,
how can data be omitted column-wise?

3) My second problem is that overall agreement comes to zero, whereas the
data is not showing agreement to be close to zero. The ratings 4 and 5 are
being given maximum time (the ratings are 1: total disagreement and 5: Total
Agreement) whereas for 1 and 2 its very less.

Results I got using agree()
 agree(demoA)
 Percentage agreement (Tolerance=0)

 Subjects = 5
   Raters = 26
  %-agree = 0

can any1 point where I might be wrong?

4) Can we calculate percentage of agreement for each subject? If yes how?

Please help me for these queries

Thank you very much in advance
Regards
Sunita
-- 
View this message in context: 
http://www.nabble.com/Fliess-Kappa-tp24460901p24460901.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] R in rearranging equations

2009-07-13 Thread stenort

Hi,
can anyone tell me if R can be used to rearrange very complicated equations
in terms of one of the variables?
I have: 

dx/dt = a*b*m*y*(1-x)-r*x

and, having set:

dy/dx = 0, 

need to rearrange in terms of x. The problem I have is that I don't know how
to rearrange equations when the variables are not yet defined (I get
messages warning me that x etc can't be found). I'm not sure if R can
actually do this (though I know programs like Maxima can) but ideally the
solution we get should be in R. I am also aware that this isn't difficult to
solve by hand, but the remaining equations I have to work through have up to
double the number of variables.

Any help will be appreciated!
Stenort
-- 
View this message in context: 
http://www.nabble.com/R-in-rearranging-equations-tp24461619p24461619.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] subset range selections

2009-07-13 Thread Mark Knecht
How would I write the two selections each in a single subset command?

1) Two non-overlapping time ranges I want to collect together - before
10AM and after noon. Should be an OR function:

X = subset(A, t1000)  + subset(A, t1200)

2) One range between two defined times like after 10AM and before
noon. Should be an AND function:

X1 = subset(A, t1000)
X = subset(X1, t1200)

Thanks,
Mark

P.S. - The help system seems very difficult for finding this sort of
information!

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Anomaly in sample() function

2009-07-13 Thread Wolfgang Polasek
Hi all

Maybe someone knows a way to solve this anomaly in sample():
I like to compute a sample (n=100) with replications from a population  of
2500 units but if I draw repeated samples from it I dont get what seems to
be a representative sample if I look at other partitions of the population.
Enclosed is the population g99 with 4 columns: (units, partition 1 (site),
partition 2 (type), weights);
and my R program.

The problem: Some categories from  partition 2 (type) which I use to check
for representativeness, deviates up to 20 percentage points from the
population.
As criterion I have computed the mean difference and the SD of the relative
frequencies between sample and pop. What mean deviation is to expect?

Thanks for any ideas,
W. Polasek

dimnames(g99)[[1]] =paste(g99[,1])
s1= g99[paste(sample(g99[,1], 100, F, g99[,4])),1:4]
dim(s1)
j2 =table(s1[,3])   #sample density
j2g= table(g99[,3]) #pop density
chisq.test(j2g,j2)

p2=100*j2g / sum(j2g) #rel. frequency in pop
pd=p2-100* j2/sum(j2)  #difference of rel. frequency between pop and sample
round(rbind(j2g, p2, pd),2)
sum(abs(pd));sd(pd) #look for the 'best' representative sample
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] 95% Confidence Intervals for AUC - $auc.samples from the Daim Package

2009-07-13 Thread lara harrup (IAH-P)

Hi

I am trying to perform a bootstrap estimate of classification accuracy of a 
logistic regression using the 'Daim' package in r using the code at the bottom 
of this post, this all works great and I get the .632+ misclassification 
accuracy, specificity, sensitivity, AUC etc etc but what I would like is to 
access the list of AUC for each of the bootstrap samples as I need calculate 
the 95% confidence intervals for the AUC of the ROC curve for this data using 
this model. I was hoping to get this from the 2.5% and 97.5% quartiles of the 
AUCs of the 10,000 bootstrap samples that were calculated.

I have set returnSample to TRUE which I think should mean the data from each 
bootstrap is saved and when I run auc(x) I get the following

 auc(x,auc.samples)
$auc.632p
[1] 0.8296788

$auc.632
[1] 0.8302014

$auc.loob
[1] 0.8216521

$auc.app
[1] 0.8455208

$auc.samples
list()

my question is how do I access the $auc.samples list() or write it to a csv 
file or something similar to get the 2.5% and 96.5% quartile range from? or is 
there another way to get the 95% CI for the AUC?

Many thanks in advance

Lara

lara.har...@bbsrc.ac.uk



library(Daim)

mylda - function(formula,train,test){
model - lda(formula,train)

predict(model,test)$posterior[,pos]
}

x-Daim(OUTBREAK2 ~ COHESION.2 + COHESION.23 + ED.2 + PLAND.12 +
PLAND.2 + PLAND.25 + PLAND.26 + ALTITUDE_MEAN + SLOPE_MEAN + LSI,
model=mylda,data=landscape,labpos=1,
control=Daim.control(method=boot,number=1),returnSample=TRUE,
cutoff=0.71)

auc(x)
summary(x)


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] subset range selections

2009-07-13 Thread jim holtman
1)   X - subset(A, (t  1000) | (t  1200))
2)   X - subset(A, (t  1000)  (t  1200))

On Mon, Jul 13, 2009 at 9:47 AM, Mark Knechtmarkkne...@gmail.com wrote:
 How would I write the two selections each in a single subset command?

 1) Two non-overlapping time ranges I want to collect together - before
 10AM and after noon. Should be an OR function:

 X = subset(A, t1000)  + subset(A, t1200)

 2) One range between two defined times like after 10AM and before
 noon. Should be an AND function:

 X1 = subset(A, t1000)
 X = subset(X1, t1200)

 Thanks,
 Mark

 P.S. - The help system seems very difficult for finding this sort of
 information!

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R in rearranging equations

2009-07-13 Thread Gabor Grothendieck
rSymPy and Ryacas can do algebraic manipulation.
See
http://rsympy.googlecode.com
http://ryacas.googlecode.com

 library(rSymPy)
Loading required package: rJava
 sympy(var('a b m r y x'))
[1] (a, b, m, r, y, x)
 sympy(solve(a*b*m*y*(1-x)-r*x, x))
[1] [-a*b*m*y/(-r - a*b*m*y)]


 library(Ryacas)
 a - Sym(a); b - Sym(b); m - Sym(m); r - Sym(r)
 x - Sym(x); y - Sym(y)
 Solve(a*b*m*y*(1-x)-r*x, x)
[1] Starting Yacas!
expression(list(x == a * b * m * y/(a * b * m * y + r)))


On Mon, Jul 13, 2009 at 9:23 AM,
stenortsten...@merchantsconnected.org.uk wrote:

 Hi,
 can anyone tell me if R can be used to rearrange very complicated equations
 in terms of one of the variables?
 I have:

 dx/dt = a*b*m*y*(1-x)-r*x

 and, having set:

 dy/dx = 0,

 need to rearrange in terms of x. The problem I have is that I don't know how
 to rearrange equations when the variables are not yet defined (I get
 messages warning me that x etc can't be found). I'm not sure if R can
 actually do this (though I know programs like Maxima can) but ideally the
 solution we get should be in R. I am also aware that this isn't difficult to
 solve by hand, but the remaining equations I have to work through have up to
 double the number of variables.

 Any help will be appreciated!
 Stenort
 --
 View this message in context: 
 http://www.nabble.com/R-in-rearranging-equations-tp24461619p24461619.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ?max (so far...)

2009-07-13 Thread Erich Neuwirth

Belated answer:

A few remarks regarding your questions:

Your running max problem could be solved in the following way:
(which is a soution based o Duncan Murdoch's suggestion,
but a little bit more general.

foldOrbit-function(x,fun){
res-numeric(length(x))
res[1]-x[1]
for (i in 2:length(x)) res[i]-fun(res[i-1],x[i])
res
}


or more generally

applySliding-function(x,fun,winlength=length(x)){
res-numeric(length(x))
for (i in seq_along(x)) {res[i]-fun(x[(max(1,i-winlength+1)):i])}
res
}


foldOrbit(x,max)
will give you the running maxes of vector x.
For max, taking the max of the max of the sequence without the last  
element

and the last element gives the max of the whole sequence.
It also works for min, sum, prod (all these are associative).

applySliding is more general. The second argument is the function you  
want to apply in running mode.
If you do not give the winlength, it will apply the function in  
running mode an give correct result for

nonassociatve functions also.
If you give the winlength, it will only use the last winlength  
elements of the vector.

Examples:
foldOrbit(1:10,max)
applySliding(1:10,max)
applySliding(1:10,max,3)

And now, for something completely different:

You seem to want to combine Excel and R in you work.
Possibly you can make your work easier if you user RExcel,
which is an add-in allowing to use R from within Excel.

Information is available at rcom.univie.ac.at
and there is (half hour long) video demonstrating how to use
R from within Excel.




On Jul 1, 2009, at 10:26 PM, Mark Knecht wrote:

On Wed, Jul 1, 2009 at 12:54 PM, Duncan  
Murdochmurd...@stats.uwo.ca wrote:

On 01/07/2009 1:26 PM, Mark Knecht wrote:


On Wed, Jul 1, 2009 at 9:39 AM, Duncan Murdochmurd...@stats.uwo.ca
wrote:


On 01/07/2009 11:49 AM, Mark Knecht wrote:


Hi,
 I have a data.frame that is date ordered by row number - earliest
date first and most current last. I want to create a couple of new
columns that show the max and min values from other columns *so  
far* -

not for the whole data.frame.

 It seems this sort of question is really coming from my lack of
understanding about how R intends me to limit myself to portions  
of a
data.frame. I get the impression from the help files that the  
generic
way is that if I'm on the 500th row of a 1000 row data.frame and  
want

to limit the search max does to rows 1:500  I should use something
like [1:row] but it's not working inside my function. The idea  
works
outside the function, in the sense I can create tempt1[1:7] and  
the

max function returns what I expect. How do I do this with row?

 Simple example attached. hp should be 'highest p', ll should be
'lowest l'. I get an error message Error in 1:row : NA/NaN  
argument


Thanks,
Mark


SNIP


HighLow = function (MyFrame) {
  temp1 - MyFrame$p[1:row]
  MyFrame$hp - max(temp1) ## Highest p
  temp1 - MyFrame$l[1:row]
  MyFrame$ll - min(temp1) ## Lowest l

  return(MyFrame)
}


You get an error in this function because you didn't define row,  
so R
assumes you mean the function in the base package, and 1:row  
doesn't make

sense.

What you want for the highest so far is the cummax (for  
cumulative

maximum) function.  See ?cummax.

Duncan Murdoch



Duncon,
  OK, thanks. That makes sense, as long as I want the cummax from  
the

beginning of the data.frame. (Which is exactly what I asked for!)

  How would I do this in the more general case if I was looking for
the cummax of only the most recent 50 rows in my data.frame? What  
I'm
trying to get down to is that as I fill in my data.frame I need to  
be

able get a max or min or standard deviation of the previous so many
rows of data - not the whole column - and I'm just not grasping  
how to

do this. Is seems like I should be able to create a data set that's
only a portion of a column while I'm in the function and then take  
the

cummax on that, or use it as an input to a standard deviation, etc.?


What you describe might be called a running max.  The caTools  
package has

a runmax function that probably does what you want.

More generally, you can always write a loop.  They aren't  
necesssrily fast
or elegant, but they're pretty general.  For example, to calculate  
the max
of the previous 50 observations (or fewer near the start of a  
vector), you

could do

x - ... some vector ...

result - numeric(length(x))
for (i in seq_along(x)) {
 result[i] - max( x[ max(1, i-49):i ])
}

Duncan Murdoch



Thanks for the pointer. I'll check it out.

Today I've managed to get pretty much all of my Excel spreadsheet
built in R except for some of the charts. It took me a week and a half
in Excel. This is my 3rd full day with R. Charts are next.

I appreciate your help and the help I've gotten from others. Thanks  
so much.


cheers,
Mark

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE 

Re: [R] Help With Fleiss Kappa

2009-07-13 Thread William Revelle

At 6:48 PM +1200 7/13/09, mehdi ebrahim wrote:

Hi All,

I am using fleiss kappa for inter rater agreement.  Are there any know
issues with Fleiss kappa calculation in R?  Even when I supply mock data
with total agreement among the raters I do not get a kappa value of 1.
instead I am getting negative values.

I am using the irr package version 0.70

Any help is much appreciated.


Mehdi,
  Are you by any chance giving the function the agreement matrix 
rather than the raw data?  The  kappam.fliess function seems to want 
the ratings rather than the agreement matrix.


Bill


--
William Revelle http://personality-project.org/revelle.html
Professor   http://personality-project.org/personality.html
Department of Psychology http://www.wcas.northwestern.edu/psych/
Northwestern University http://www.northwestern.edu/
Attend  ISSID/ARP:2009   http://issid.org/issid.2009/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] zero cells in one variable in logistic regression

2009-07-13 Thread Vito Muggeo (UniPa)

dear anna,
if you are not interested in point estimate and SE of the parameter of 
the aforementioned categorical variable, I believe the conventional 
glm(..,family=binomial) is correct. In particular, the returned deviance 
is reliable and also it is the relevant likelihood ratio test..


hope this helps,
vito


David Winsemius ha scritto:


On Jul 13, 2009, at 5:37 AM, anna.bucharova wrote:



Dear all.
I am sort of beginner with R. I do logistic regression with binomial
response variable and several continuous and categorical variables. In 
one

categorical variable, zero cell occures (2x2 table looks like
7 - 0
23 - 25
This leads to overestimating of odds ratio and inflated confidence 
interval
for odds for given variable. The variable is significant in univariate 
test.
I do not necessarilly need odd ratio, but I need the explained 
deviance by

this variable and I really want to keep this variable in the model. It
probably matters for explained deviance. How to treat this problem?
Thanks for help, Anna Bucharova
--


You might consider glmrob in package:robustbase. See 
http://www.jstatsoft.org/v10/i04


David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.



--

Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 6626240
fax: 091 485726/485612
http://dssm.unipa.it/vmuggeo

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Heckman Selection MOdel Help in R

2009-07-13 Thread Arne Henningsen
On Mon, Jul 13, 2009 at 11:18 AM, Pathak,
Sauravs.patha...@imperial.ac.uk wrote:
 Dear Arne
 I have gone through the paper and I have tried it at my end, I would really 
 appreciate if you could address the following:

 1. Based upon your suggestion I used the following:

 regmod2 - selection(s ~ age + gender + gemedu + gemhinc + es_gdppc +
    imf_pop + estbbo_m, ln_oy5_1 ~ age+ gender+fearfail+gemedu, adpopdata, 
 method = 2step)
 On trying the above( notice that I have changed heckit to selection in 
 the above command, i get the following error message

 Error in coef.probit(result$probit) :
  could not find function coef.maxLik

That's weird. Which versions of R, sampleSelection, and maxLik do you use?

 Before trying the above I tried the following:

 2. When I tried to do the Heckman selection model in stages , the first run 
 was successful, I mean, using the following:

 myProbit- glm(s ~ age + gender + gemedu + gemhinc + es_gdppc +
 +     imf_pop + estbbo_m, family = binomial(link = probit))
 summary(myProbit)

 I am successful upto this point, but

 3. When I try calculating the IMR using the following:
 adpopdata$IMR-invMillsRatio(myProbit)$IMR1

 I get the error below
 Error in `$-.data.frame`(`*tmp*`, IMR, value = c(2.50039945424535,  :
  replacement has 257358 rows, data has 343251

I guess that you have some NAs in the data so that you have the IMRs
not for all observations but only for the observations witout NAs.

R myIMRs - invMillsRatio(myProbit)$IMR1
should work.

 Is there a code to calculate IMR by hand??

Yes, inside invMillsRatio()
However, why do you want to do this?

 what I see is that the number of rows of IMR calculated and the number
 of rows in the actual data set do not match (may be some missing
 value issues, I am not sure, if it is, how to fix it?) and hence IMR could
 not be added to my original data set, how do I fix this and then proceed
 to get correct IMR to use in my outcome equation  (the OLS stage)

 This is really taking a lot of time, I am working on it for weeks, can
 you please help me kindly, If you wish I can send you the data set as well

Please try to fix it yourself.

Arne


 -Original Message-
 From: Arne Henningsen [mailto:arne.henning...@googlemail.com]
 Sent: 13 July 2009 00:56
 To: Pathak, Saurav; r-help@r-project.org; otoo...@ut.ee
 Subject: Re: Heckman Selection MOdel Help in R

 Hi Saurav!

 On Sun, Jul 12, 2009 at 6:06 PM, Pathak,
 Sauravs.patha...@imperial.ac.uk wrote:
 I am new to R, I have to do a 2 step Heckman model, my selection equation is
 below which I was successful in running but I am unable to proceed further,



 I have so far used the following command

 glm(formula = s ~ age + gender + gemedu + gemhinc + es_gdppc +
     imf_pop + estbbo_m, family = binomial(link = probit))

 My question is
 1. How do i discard the non significant selection variables (one out of the
 seven variables above is non-significant) and calculate the Inverse Mills
 Ratio of the significant variables

 2. I need the inverse mills ratio from the above to run the outcome equation
 model using OLS with the Inverse mills ratio calculated on the basis of the
 above probit as the control in my outcome equation,  hence I need to get the
 IMR (Is there another direct way?)

 3. How can this be done in R using my concept or otherwise does there exist
 another way of doing what I wish to achieve



 On trying

 regmod - heckit(s ~ age + gender + gemedu + gemhinc + es_gdppc +

     imf_pop + estbbo_m, ln_oy5_1 ~ age+ gender+fearfail+gemedu,
 adpopdata,method=2step)



 I get

 Error: could not find function heckit



 Error: could not find function invMillsRatio



 Am I missing out something, do i have to install something apart from R
 also, so far I have used



 install.packages( sampleSelection, repos=http://R-Forge.R-project.org; )

 install.packages(Rcmdr, dependencies=TRUE)



 Even then I am unable to run heckit, please help

 You have to install (only once) and *load* the package before you can use it:
 R library( sampleSelection )

 I suggest that you do NOT use function heckit but function
 selection, because the latter allows you to estimate the model both
 by the 2-step and the 1-step (ML) method (depending on argument
 method).

 Our paper about the sampleSelection package published in the JSS could
 be also helpful for you:
 http://www.jstatsoft.org/v27/i07/

 Arne

 --
 Arne Henningsen
 http://www.arne-henningsen.name




-- 
Arne Henningsen
http://www.arne-henningsen.name

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] plm Issues

2009-07-13 Thread Damien Moore
Duh! Thanks and good advice. I was using 2.7.2 because it was, until
recently, the latest version working with RPy (http://rpy.sourceforge.net/).
Also didn't realize plm was still actively developed.

Interesting that since plm now correctly handles diff and lag operations, it
actually breaks with the behavior of lm:

 a=ts(c(1,2,4))
 lm(a~diff(a))
Error in model.frame.default(formula = a ~ diff(a), drop.unused.levels =
TRUE) :
  variable lengths differ (found for 'diff(a)')

To regress a on its difference, one needs the more laborious:
 a=ts(c(1,2,4))
 adata=as.data.frame(cbind(a,diff(a)))
 colnames(adata)=c(a,diffa)
 lm(a~diffa,data=adata)

Call:
lm(formula = a ~ diffa, data = adata)

Coefficients:
(Intercept)diffa
  02

From the R help
Fitting Linear ModelsUsing time series

Considerable care is needed when using lm with time series.

Unless na.action = NULL, the time series attributes are stripped from the
variables before the regression is done. (This is necessary as omitting NAs
would invalidate the time series attributes, and if NAs are omitted in the
middle of the series the result would no longer be a regular time series.)

Even if the time series attributes are retained, they are not used to line
up series, so that the time shift of a lagged or differenced regressor would
be ignored. It is good practice to prepare a data argument by
ts.intersectts.union.html(...,
dframe = TRUE), then apply a suitable na.action to that data frame and call
lm with na.action = NULL so that residuals and fitted values are time
series.


On Sat, Jul 11, 2009 at 10:53 PM, milton ruser milton.ru...@gmail.comwrote:

 The first think one need to do when has a so old version, is update it :-)
 After, if the problem remain, try get help with the colleagues.

 best

 milton

 On Thu, Jul 9, 2009 at 10:58 AM, Damien Moore damienlmo...@gmail.comwrote:

 Hi List

 I'm having difficulty understanding how plm should work with dynamic
 formulas. See the commands and output below on a standard data set. Notice
 that the first summary(plm(...)) call returns the same result as the
 second
 (it shouldn't if it actually uses the lagged variable requested). The
 third
 call results in error (trying to use diff'ed variable in regression)

 Other info: I'm running R 2.7.2 on WinXP

 cheers



 *data(Gasoline,package=Ecdat)
 Gasoline_plm-plm.data(Gasoline,c(country,year))
 pdim(Gasoline_plm)
 **Balanced Panel: n=18, T=19, N=342
 *
 *summary(plm(lgaspcar~lincomep,data=Gasoline_plm**))
 **Oneway (individual) effect Within Model

 Call:
 plm(formula = lgaspcar ~ lincomep, data = Gasoline_plm)

 Balanced Panel: n=18, T=19, N=342

 Residuals :
Min.  1st Qu.   Median  3rd Qu. Max.
 -0.40100 -0.08410 -0.00858  0.08770  0.73400

 Coefficients :
 Estimate Std. Error t-value  Pr(|t|)
 lincomep -0.761830.03535 -21.551  2.2e-16 ***
 ---
 Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

 Total Sum of Squares: 17.061
 Residual Sum of Squares: 6.9981
 Multiple R-Squared: 0.58981
 F-statistic: 464.442 on 323 and 1 DF, p-value: 0.036981

 ** summary(plm(lgaspcar~lag(lincomep),data=Gasoline_plm))
 **Oneway (individual) effect Within Model

 Call:
 plm(formula = lgaspcar ~ lag(lincomep), data = Gasoline_plm)

 Balanced Panel: n=18, T=19, N=342

 Residuals :
Min.  1st Qu.   Median  3rd Qu. Max.
 -0.40100 -0.08410 -0.00858  0.08770  0.73400

 Coefficients :
  Estimate Std. Error t-value  Pr(|t|)
 lag(lincomep) -0.761830.03535 -21.551  2.2e-16 ***
 ---
 Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

 Total Sum of Squares: 17.061
 Residual Sum of Squares: 6.9981
 Multiple R-Squared: 0.58981
 F-statistic: 464.442 on 323 and 1 DF, p-value: 0.036981

 *
 *summary(plm(lgaspcar~diff(lincomep),data=Gasoline_plm))*
 *Error in model.frame.default(formula = lgaspcar ~ diff(lincomep), data =
 mydata,  :
  variable lengths differ (found for 'diff(lincomep)')
 *

[[alternative HTML version deleted]]


 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.htmlhttp://www.r-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] variance explained by each predictor in GAM

2009-07-13 Thread Kayce Anderson
Many thanks for the advice David. I would really like to figure out, though,
how to get the contribution of each factor to the Rsq - something like a
Beta coefficient for GAM.   Ideas?
KC

On Sun, Jul 12, 2009 at 5:41 PM, David Winsemius dwinsem...@comcast.netwrote:


 On Jul 12, 2009, at 5:06 PM, Kayce Anderson wrote:

  Hi,
 I am using mgcv:gam and have developed a model with 5 smoothed predictors
 and one factor.

 gam1 - gam(log.sp~ s(Spr.precip,bs=ts)  + s(Win.precip,bs=ts) + s(
 Spr.Tmin,bs=ts)  + s(P.sum.Tmin,bs=ts) + s( Win.Tmax,bs=ts)
 +factor(site),data=dat3)


 The total deviance explained = 70.4%.


 I would like to extract the variance explained by each predictor.  Is
 there
 a straightforward way to do this?  I have tried dropping a term and
 recalculating the model, but the edf's change if there is any correlation
 among variables, thereby making all of the relationships different.  I
 haven't yet figured out how to fix the smoothing terms- I get syntax error
 messages.  Among other variations, I tried, for example,
 log.sp~s(Spr.precip, sp=3.9, fx=TRUE) +...



 ?anova.gam

 Obviously I cannot test this with your dat3. You get an F-statistic for
 each s() term by default and you are referred to saummary.gam for further
 explanation.

 David Winsemius, MD
 Heritage Laboratories
 West Hartford, CT



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] weighted cox ph model

2009-07-13 Thread Bessy

Dear all,

I am struggling with the weighted Efron approximation. I really need some
help.

I am confused by the explanation regarding applying weights to Efron
approximation on the paper of “A Package for Survival Analysis in S” (Terry
M. Therneau, Feb 1999).  In the section 3.3.7 Weighted Cox models, it states
that, for a simple example with 5 subjects in time order and 1 and 2 died at
the same time, the second term of the weighted partial likelihood is either
w1r1/ ( w1r1 + w3r3+ w4r4+ w5r5)
or 
w1r1/ (w2r2+ w3r3+ w4r4+ w5r5)

The author suggests that to compute the Efron approximation one can simply
separately replace the numerator with 0.5 (w1r1 + w2r2) and the denominator
with 0.5w1r1 + 0.5w2r2+ w3r3+ w4r4+ w5r5. 

Does this suggestion refer to the second term (stated above) of the
likelihood function only ?

If so should the first term of the likelihood function should still be 
w2r2/ (w1r1 + w2r2+ w3r3+ w4r4+ w5r5)
or 
w1r1/ (w1r1 + w2r2+ w3r3+ w4r4+ w5r5)?

Then to compute the Efron approximation, should the numerator of the this
likelihood function should be 
0.5 (w1r1 + w2r2)* 0.5 (w1r1 + w2r2), 
i.e.
the likelihood function should be
[0.5 (w1r1 + w2r2)/ (w1r1 + w2r2+ w3r3+ w4r4+ w5r5) ]* [0.5 (w1r1 + w2r2)/(
0.5w1r1 + 0.5w2r2+ w3r3+ w4r4+ w5r5)]?

Since we add the probability to the numerator of the second term, we should
add the probability to the numerator of the first term as well. If 1 and 2
each have the probability of 0.5 for the deaths in the second term they
should also have the probability of 0.5 for the deaths in the first term.
That’s my thinking. If it is wrong, please correct it for me. I will really
appreciate it. 

By the way, I tried another weighted Efron approximation, which can yield
the same result with R if and only if the weights for tied data are the
same.  Otherwise it yields the different results but close to the R’s.

The formula has been attached with this message already.
Could someone give me some hints regarding my formula, please? 

Any suggestions will be sincerely appreciated.

Bessy
http://www.nabble.com/file/p24462598/Weighted%2BEfron%2BApproximation_Nabble_R.doc
Weighted+Efron+Approximation_Nabble_R.doc 
-- 
View this message in context: 
http://www.nabble.com/weighted-cox-ph-model-tp24462598p24462598.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] subset range selections

2009-07-13 Thread Mark Knecht
Thanks Jim.

How does one search the help system for info on simple logic like this?

On Mon, Jul 13, 2009 at 6:59 AM, jim holtmanjholt...@gmail.com wrote:
 1)   X - subset(A, (t  1000) | (t  1200))
 2)   X - subset(A, (t  1000)  (t  1200))

 On Mon, Jul 13, 2009 at 9:47 AM, Mark Knechtmarkkne...@gmail.com wrote:
 How would I write the two selections each in a single subset command?

 1) Two non-overlapping time ranges I want to collect together - before
 10AM and after noon. Should be an OR function:

 X = subset(A, t1000)  + subset(A, t1200)

 2) One range between two defined times like after 10AM and before
 noon. Should be an AND function:

 X1 = subset(A, t1000)
 X = subset(X1, t1200)

 Thanks,
 Mark

 P.S. - The help system seems very difficult for finding this sort of
 information!

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




 --
 Jim Holtman
 Cincinnati, OH
 +1 513 646 9390

 What is the problem that you are trying to solve?


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] plm Issues

2009-07-13 Thread Gabor Grothendieck
The dyn and dynlm packages can handle time series in lm and glm.
(dyn can also handle many additional regression functions as well)
In the case of dyn just write dyn$lm instead of lm like this:

 library(dyn)
 a - ts(c(1, 2, 4))
 dyn$lm(a ~ diff(a))

Call:
lm(formula = dyn(a ~ diff(a)))

Coefficients:
(Intercept)  diff(a)
  02


On Mon, Jul 13, 2009 at 10:17 AM, Damien Mooredamienlmo...@gmail.com wrote:
 Duh! Thanks and good advice. I was using 2.7.2 because it was, until
 recently, the latest version working with RPy (http://rpy.sourceforge.net/).
 Also didn't realize plm was still actively developed.

 Interesting that since plm now correctly handles diff and lag operations, it
 actually breaks with the behavior of lm:

 a=ts(c(1,2,4))
 lm(a~diff(a))
 Error in model.frame.default(formula = a ~ diff(a), drop.unused.levels =
 TRUE) :
  variable lengths differ (found for 'diff(a)')

 To regress a on its difference, one needs the more laborious:
 a=ts(c(1,2,4))
 adata=as.data.frame(cbind(a,diff(a)))
 colnames(adata)=c(a,diffa)
 lm(a~diffa,data=adata)

 Call:
 lm(formula = a ~ diffa, data = adata)

 Coefficients:
 (Intercept)        diffa
          0            2

 From the R help
 Fitting Linear ModelsUsing time series

 Considerable care is needed when using lm with time series.

 Unless na.action = NULL, the time series attributes are stripped from the
 variables before the regression is done. (This is necessary as omitting NAs
 would invalidate the time series attributes, and if NAs are omitted in the
 middle of the series the result would no longer be a regular time series.)

 Even if the time series attributes are retained, they are not used to line
 up series, so that the time shift of a lagged or differenced regressor would
 be ignored. It is good practice to prepare a data argument by
 ts.intersectts.union.html(...,
 dframe = TRUE), then apply a suitable na.action to that data frame and call
 lm with na.action = NULL so that residuals and fitted values are time
 series.


 On Sat, Jul 11, 2009 at 10:53 PM, milton ruser milton.ru...@gmail.comwrote:

 The first think one need to do when has a so old version, is update it :-)
 After, if the problem remain, try get help with the colleagues.

 best

 milton

 On Thu, Jul 9, 2009 at 10:58 AM, Damien Moore damienlmo...@gmail.comwrote:

 Hi List

 I'm having difficulty understanding how plm should work with dynamic
 formulas. See the commands and output below on a standard data set. Notice
 that the first summary(plm(...)) call returns the same result as the
 second
 (it shouldn't if it actually uses the lagged variable requested). The
 third
 call results in error (trying to use diff'ed variable in regression)

 Other info: I'm running R 2.7.2 on WinXP

 cheers



 *data(Gasoline,package=Ecdat)
 Gasoline_plm-plm.data(Gasoline,c(country,year))
 pdim(Gasoline_plm)
 **Balanced Panel: n=18, T=19, N=342
 *
 *summary(plm(lgaspcar~lincomep,data=Gasoline_plm**))
 **Oneway (individual) effect Within Model

 Call:
 plm(formula = lgaspcar ~ lincomep, data = Gasoline_plm)

 Balanced Panel: n=18, T=19, N=342

 Residuals :
    Min.  1st Qu.   Median  3rd Qu.     Max.
 -0.40100 -0.08410 -0.00858  0.08770  0.73400

 Coefficients :
         Estimate Std. Error t-value  Pr(|t|)
 lincomep -0.76183    0.03535 -21.551  2.2e-16 ***
 ---
 Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

 Total Sum of Squares: 17.061
 Residual Sum of Squares: 6.9981
 Multiple R-Squared: 0.58981
 F-statistic: 464.442 on 323 and 1 DF, p-value: 0.036981

 ** summary(plm(lgaspcar~lag(lincomep),data=Gasoline_plm))
 **Oneway (individual) effect Within Model

 Call:
 plm(formula = lgaspcar ~ lag(lincomep), data = Gasoline_plm)

 Balanced Panel: n=18, T=19, N=342

 Residuals :
    Min.  1st Qu.   Median  3rd Qu.     Max.
 -0.40100 -0.08410 -0.00858  0.08770  0.73400

 Coefficients :
              Estimate Std. Error t-value  Pr(|t|)
 lag(lincomep) -0.76183    0.03535 -21.551  2.2e-16 ***
 ---
 Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

 Total Sum of Squares: 17.061
 Residual Sum of Squares: 6.9981
 Multiple R-Squared: 0.58981
 F-statistic: 464.442 on 323 and 1 DF, p-value: 0.036981

 *
 *summary(plm(lgaspcar~diff(lincomep),data=Gasoline_plm))*
 *Error in model.frame.default(formula = lgaspcar ~ diff(lincomep), data =
 mydata,  :
  variable lengths differ (found for 'diff(lincomep)')
 *

        [[alternative HTML version deleted]]


 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.htmlhttp://www.r-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




        [[alternative HTML version deleted]]


 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the 

Re: [R] subset range selections

2009-07-13 Thread jim holtman
You can find the operators with:

?|

This will show you logical operators.  You can do:

?+

to get the others.

On Mon, Jul 13, 2009 at 10:21 AM, Mark Knechtmarkkne...@gmail.com wrote:
 Thanks Jim.

 How does one search the help system for info on simple logic like this?

 On Mon, Jul 13, 2009 at 6:59 AM, jim holtmanjholt...@gmail.com wrote:
 1)   X - subset(A, (t  1000) | (t  1200))
 2)   X - subset(A, (t  1000)  (t  1200))

 On Mon, Jul 13, 2009 at 9:47 AM, Mark Knechtmarkkne...@gmail.com wrote:
 How would I write the two selections each in a single subset command?

 1) Two non-overlapping time ranges I want to collect together - before
 10AM and after noon. Should be an OR function:

 X = subset(A, t1000)  + subset(A, t1200)

 2) One range between two defined times like after 10AM and before
 noon. Should be an AND function:

 X1 = subset(A, t1000)
 X = subset(X1, t1200)

 Thanks,
 Mark

 P.S. - The help system seems very difficult for finding this sort of
 information!

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




 --
 Jim Holtman
 Cincinnati, OH
 +1 513 646 9390

 What is the problem that you are trying to solve?





-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Splitting dataset for Tuning Parameter with Cross Validation

2009-07-13 Thread Max Kuhn
Here is what the train() function in the caret package does by default
(you can change this behavior; see below).

Using the entire data set, estimate the RBF parameter using the
sigest() function in the kernlab package (which, if I recall correctly
involves the median of a sample of kernel matrix values).

Using this fixed value, the cost function is varied over a common set
of held-out samples. More specifically, every value of the cost
parameter is evaluated on the same exact folds. I've been able to
achieve pretty good performance this way in almost every case where
I've done the comparison,

Based on these performance values, you can select the cost function
based on the best performance. There are also ways of selecting the
simplest model that is within the uncertainty of the numerically
optimal model (that is done using the selectionFunction argument of
trainControl).

I should also note that you can tune across any grid of cost and sigma
(this is done via the tuneGrid argument of train()).

Max

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Issues with file.info?

2009-07-13 Thread Jason Rupert

Yes.  You are right that is the trick:)

Here is what I'm seeing:
input_path-c(C:/WINDOWS/system32)
file.info(list.files(input_path))


size isdir mode mtime ctime atime  exe
$winnt$.inf   NANA NA  NA  NA  NA NA
1025   NANA NA  NA  NA  NA NA
...
xsi.jar  NA NA NA  NA NA NA NA
xsi.zip  NA NA NA  NA NA NA NA
zipfldr.dll NA NA NA NA NA NA NA

input_path-c(C:/WINDOWS/system32/winmine.exe)
file.info(input_path)

  size isdir mode   mtime   
ctime   atime   exe
C:/WINDOWS/system32/winmine.exe 119808 FALSE  777 2002-12-31 07:00:00 
2008-08-15 14:28:37 2009-07-13 09:39:32 win32

Not sure why, file.info is not working for the array of input 
files/folders/etc. but successfully works when an individual file in input.  

Thanks again for your insights and feedback. 



--- On Thu, 7/9/09, Rolf Turner r.tur...@auckland.ac.nz wrote:

 From: Rolf Turner r.tur...@auckland.ac.nz
 Subject: Re: [R] Issues with file.info?
 To: Jason Rupert jasonkrup...@yahoo.com
 Cc: R-help@r-project.org R-help@r-project.org
 Date: Thursday, July 9, 2009, 3:13 PM
 
 On 10/07/2009, at 6:02 AM, Jason Rupert wrote:
 
  
  Are there any tricks associated with file.info?
  
  I just tried it on a directory folder and it returned
 NA for all fields for all files.  I tried it on a
 different folder with different files and it still returned
 NA.
  
  I tried it on a specific file and it returned all the
 proper info correctly.
  
  Just wondering if there are any tricks I've
 overlooked.
 
 Yes.  You've overlooked the trick of telling us about
 the specifics of
 your operating system, version of R, etc., and of showing
 exactly what
 commands you ``tried'', i.e. of reading the Posting Guide.
 
     cheers,
 
         Rolf Turner
 
 
 
 ##
 Attention:This e-mail message is privileged and
 confidential. If you are not theintended recipient please
 delete the message and notify the sender.Any views or
 opinions presented are solely those of the author.
 
 This e-mail has been scanned and cleared by
 MailMarshalwww.marshalsoftware.com
 ##
 




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Nonlinear Least Squares nls() programming help

2009-07-13 Thread Dieter Menne



MathZero wrote:
 
 Hi, I am trying to use the nls() function to closely approximate a vector
 of values, colC and I'm running into trouble.  I am not sure how if I am
 asking the program to do what I think its doing, because the same
 minimization in Excel's Solver does not run into problems.  
 
 

Fitting a function is an approximation, trying to find a minimum. Think of
frozen mountain lake surrounded by mountains. Excel's Solver will report the
highest tip of the snowflake on the lake, if it finds it. nls will find out
that the lake is essentially flat compare to the surrounding and tell you
this fact in unkind word.

In your case, I assume there is an (additional?) problem that the solution
might be non-identifyable. It could required additional constraints, for
example that one coefficient is always larger than the other.

Dieter


-- 
View this message in context: 
http://www.nabble.com/Nonlinear-Least-Squares-nls%28%29-programming-help-tp24445472p24463097.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] adjusting survival using coxph

2009-07-13 Thread Christopher Jones
I have what I *think* should be a simple problem in R, and hope  
someone might be able to help me.

I'm working with cancer survival data, and would like to calculate  
adjusted survival figures based on the age of the patient and the  
tumour classification. A friendly statistician told me I should use  
Cox proportional hazards to do this, and I've made some progress with  
using the coxph function. However, there doesn't seem to be a simple  
way to get adjusted survival figures, and I'm still a bit unsure  
whether I'm using the function correctly.

I've found plenty of examples of how to plot survival curves for a  
single catagorical variable, like tumour stage. However, its a bit  
less obvious how to properly combine multiple variables (like stage  
and grade) - examples I've found aren't very helpful. For instance,  
can someone explain what these two function calls are doing differently?

   cox.model1 - coxph(Surv(time,outcome)~stage+grade,method=breslow)
   cox.model2 - coxph(Surv(time,outcome)~stage+grade 
+stage:grade,method=breslow)

Also, am I right to think that this is how I would plot the survival  
curves for stage in a 50 year old patient?

   cox.model3 - coxph(Surv(time,outcome)~stage+I 
(age-50),method=breslow)
   plot(survfit(cox.model3,newdata=data.frame(stage2=c 
(III,IV),age=c(50,50

Finally, assuming I manage to correctly call the coxph function in  
the first place (!), what is the sensible way to adjust the survival  
for each patient? My understanding of what I'm trying/hoping to do  
has led me down the following dark alley;

   1) pick a sensible centre point (median age, modal stage/grade) as  
my 'standard' case
   2) obtain the survival probabilities for this 'standard' case from  
the Cox model
   3) for each patient find the probability of their observed  
survival, for their specific age/stage/grade
   4) the adjusted survival is the survival time from the 'standard'  
case that matches the observed survival probability

This seems long-winded, and for all I know might be completely the  
wrong way to do it.  hopefully someone will be able to offer some  
friendly advice!

Apologies for the long-ish post, and thanks in advance for any help.


Chris Jones.


---
Gynaecological Cancer Research Laboratories,
UCL EGA Institute for Women's Health,
University College London,
Paul O'Gorman Building,
72 Huntley Street,
London
WC1E 6DD
United Kingdom

Telephone; 020 3108 2007
Fax; 020 3108 2010


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] subset range selections

2009-07-13 Thread Duncan Murdoch

On 7/13/2009 10:21 AM, Mark Knecht wrote:

Thanks Jim.

How does one search the help system for info on simple logic like this?


Look in the manuals (the pdf ones), rather than the man pages.  For 
example, expressions like the ones below are in the Introduction to R, 
section 2.4, Logical Vectors.  (The table of contents contains Logical 
vectors, within Simple manipulations.)


Duncan Murdoch



On Mon, Jul 13, 2009 at 6:59 AM, jim holtmanjholt...@gmail.com wrote:

1)   X - subset(A, (t  1000) | (t  1200))
2)   X - subset(A, (t  1000)  (t  1200))

On Mon, Jul 13, 2009 at 9:47 AM, Mark Knechtmarkkne...@gmail.com wrote:

How would I write the two selections each in a single subset command?

1) Two non-overlapping time ranges I want to collect together - before
10AM and after noon. Should be an OR function:

X = subset(A, t1000)  + subset(A, t1200)

2) One range between two defined times like after 10AM and before
noon. Should be an AND function:

X1 = subset(A, t1000)
X = subset(X1, t1200)

Thanks,
Mark

P.S. - The help system seems very difficult for finding this sort of
information!

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.





--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Issues with file.info?

2009-07-13 Thread Duncan Murdoch

On 7/13/2009 10:42 AM, Jason Rupert wrote:

Yes.  You are right that is the trick:)

Here is what I'm seeing:
input_path-c(C:/WINDOWS/system32)
file.info(list.files(input_path))


size isdir mode mtime ctime atime  exe
$winnt$.inf   NANA NA  NA  NA  NA NA
1025   NANA NA  NA  NA  NA NA
...
xsi.jar  NA NA NA  NA NA NA NA
xsi.zip  NA NA NA  NA NA NA NA
zipfldr.dll NA NA NA NA NA NA NA

input_path-c(C:/WINDOWS/system32/winmine.exe)
file.info(input_path)

  size isdir mode   mtime   
ctime   atime   exe
C:/WINDOWS/system32/winmine.exe 119808 FALSE  777 2002-12-31 07:00:00 
2008-08-15 14:28:37 2009-07-13 09:39:32 win32

Not sure why, file.info is not working for the array of input files/folders/etc. but successfully works when an individual file in input.  


Look at what list.files() returns:  there is no path, just the filename. 
 If you want to use file.info, you need the full path, so you'd use


file.info(list.files(input_path,full=TRUE))

(That takes a long time, by the way:  there are a lot of files there, 
and they all need to be opened for inspection!)


Duncan Murdoch


Thanks again for your insights and feedback. 




--- On Thu, 7/9/09, Rolf Turner r.tur...@auckland.ac.nz wrote:


From: Rolf Turner r.tur...@auckland.ac.nz
Subject: Re: [R] Issues with file.info?
To: Jason Rupert jasonkrup...@yahoo.com
Cc: R-help@r-project.org R-help@r-project.org
Date: Thursday, July 9, 2009, 3:13 PM

On 10/07/2009, at 6:02 AM, Jason Rupert wrote:

 
 Are there any tricks associated with file.info?
 
 I just tried it on a directory folder and it returned

NA for all fields for all files.  I tried it on a
different folder with different files and it still returned
NA.
 
 I tried it on a specific file and it returned all the

proper info correctly.
 
 Just wondering if there are any tricks I've

overlooked.

Yes.  You've overlooked the trick of telling us about
the specifics of
your operating system, version of R, etc., and of showing
exactly what
commands you ``tried'', i.e. of reading the Posting Guide.

cheers,

Rolf Turner



##
Attention:This e-mail message is privileged and
confidential. If you are not theintended recipient please
delete the message and notify the sender.Any views or
opinions presented are solely those of the author.

This e-mail has been scanned and cleared by
MailMarshalwww.marshalsoftware.com
##






__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] math expressions in graphs

2009-07-13 Thread Rafael Moral
Dear useRs,

I want to put a math expression in the ylab in my plot which should be a Delta 
and a 'y' with a trace and a hat above it and a 't' as subscription.
How could I manage to do it?

Thanks in advance,
Regards,
Rafael.


  

[[elided Yahoo spam]]

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] variance explained by each predictor in GAM

2009-07-13 Thread Simon Wood
You can get some idea by doing something like the following, which compares 
the r^2 for models b and b2, i.e. with and without s(x2).  It keeps the 
smoothing parameters fixed for the comparison. (s(x,fx=TRUE) removes 
penalization altogether btw, which is not what was wanted). 

dat - gamSim(1,n=400,dist=normal,scale=2)
b-gam(y~s(x0)+s(x1)+s(x2)+s(x3),data=dat)
b2-gam(y~s(x0)+s(x1)+s(x3),sp=b$sp[-3],data=dat)
summary(b2)$dev.expl
summary(b)$dev.expl


On Monday 13 July 2009 15:09, Kayce Anderson wrote:
 Many thanks for the advice David. I would really like to figure out,
 though, how to get the contribution of each factor to the Rsq - something
 like a Beta coefficient for GAM.   Ideas?
 KC

 On Sun, Jul 12, 2009 at 5:41 PM, David Winsemius 
dwinsem...@comcast.netwrote:
  On Jul 12, 2009, at 5:06 PM, Kayce Anderson wrote:
 
   Hi,
 
  I am using mgcv:gam and have developed a model with 5 smoothed
  predictors and one factor.
 
  gam1 - gam(log.sp~ s(Spr.precip,bs=ts)  + s(Win.precip,bs=ts) + s(
  Spr.Tmin,bs=ts)  + s(P.sum.Tmin,bs=ts) + s( Win.Tmax,bs=ts)
  +factor(site),data=dat3)
 
 
  The total deviance explained = 70.4%.
 
 
  I would like to extract the variance explained by each predictor.  Is
  there
  a straightforward way to do this?  I have tried dropping a term and
  recalculating the model, but the edf's change if there is any
  correlation among variables, thereby making all of the relationships
  different.  I haven't yet figured out how to fix the smoothing terms- I
  get syntax error messages.  Among other variations, I tried, for
  example,
  log.sp~s(Spr.precip, sp=3.9, fx=TRUE) +...
 
  ?anova.gam
 
  Obviously I cannot test this with your dat3. You get an F-statistic for
  each s() term by default and you are referred to saummary.gam for further
  explanation.
 
  David Winsemius, MD
  Heritage Laboratories
  West Hartford, CT

   [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html and provide commented, minimal,
 self-contained, reproducible code.

-- 
 Simon Wood, Mathematical Sciences, University of Bath, Bath, BA2 7AY UK
 +44 1225 386603  www.maths.bath.ac.uk/~sw283

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] variance explained by each predictor in GAM

2009-07-13 Thread David Winsemius
It appears you are conflating beta coefficients (individual covariate  
effect measures) with overall model fit measures. Beta coefficients  
are not directly comparable to R-squared measures in ordinary least  
squares analyses, so why would they be so in gam models?


I cannot tell whether you actually looked at anova.gam or consulted  
Wood's book (which is really the better place to go for advice  
regarding mgcv function questions. You were offered an apportioned F- 
statistic for each covariate smooth. How was that not satisfactory?  
What is your goal in this search?


--
DW

--
On Jul 13, 2009, at 10:09 AM, Kayce Anderson wrote:

Many thanks for the advice David. I would really like to figure out,  
though, how to get the contribution of each factor to the Rsq -  
something like a Beta coefficient for GAM.   Ideas?


KC

On Sun, Jul 12, 2009 at 5:41 PM, David Winsemius dwinsem...@comcast.net 
 wrote:


On Jul 12, 2009, at 5:06 PM, Kayce Anderson wrote:

Hi,
I am using mgcv:gam and have developed a model with 5 smoothed  
predictors

and one factor.

gam1 - gam(log.sp~ s(Spr.precip,bs=ts)  + s(Win.precip,bs=ts) +  
s(

Spr.Tmin,bs=ts)  + s(P.sum.Tmin,bs=ts) + s( Win.Tmax,bs=ts)
+factor(site),data=dat3)


The total deviance explained = 70.4%.


I would like to extract the variance explained by each predictor.


You may need to define your terms and there may not be an analog

Is there
a straightforward way to do this?  I have tried dropping a term and
recalculating the model, but the edf's change if there is any  
correlation

among variables, thereby making all of the relationships different.  I
haven't yet figured out how to fix the smoothing terms- I get syntax  
error

messages.  Among other variations, I tried, for example,
log.sp~s(Spr.precip, sp=3.9, fx=TRUE) +...



?anova.gam

Obviously I cannot test this with your dat3. You get an F-statistic  
for each s() term by default and you are referred to saummary.gam  
for further explanation.


David Winsemius, MD
Heritage Laboratories
West Hartford, CT




David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] object .trPaths not found

2009-07-13 Thread Knut Krueger

Uwe Ligges schrieb:




I have to admit that I have no idea what we are talking about here 
(yes, I tend to forget many things these days) - and you have not 
cited the original message, unfortunately (nor have you specifies R 
versions, Tinn-R versions and both OS versions, but just one) ...


Best wishes,
Uwe

Original question
rkevinbur...@charter.net wrote:
I am running an R script with Tinn-R (2.2.0.1) and I get the error 
message


Error in source(.trPaths[4], echo = TRUE, max.deparse.length = 150) : 
  object .trPaths not found


Any solutions?

System

R 2.8.1
Tinn-R 2.2.02
windows Xp professional Servicepack 3

This combination works on my Pc without problems
And the code is working on the other system with Tinn-R problems
with the R-Editor without problems
so it seems to be a Tinn_R problem

but maybe anybody has a solution for that.

Regards Knut

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] math expressions in graphs

2009-07-13 Thread Jorge Ivan Velez
Dear Rafael,
Try the following which seems to be close to what you asked for:

plot(1, ylab = expression( Delta ~ hat(y) [t] ) )


HTH,

Jorge



On Mon, Jul 13, 2009 at 10:56 AM, Rafael Moral
rafa_moral2...@yahoo.com.brwrote:

 Dear useRs,

 I want to put a math expression in the ylab in my plot which should be a
 Delta and a 'y' with a trace and a hat above it and a 't' as subscription.
 How could I manage to do it?

 Thanks in advance,
 Regards,
 Rafael.



  
 
 [[elided Yahoo spam]]

[[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Lattice xyplot: same scales within one factor

2009-07-13 Thread OB
I am using R 2.8.1 and lattice to produce xyplots conditioned on
two factors. What I would like is to have the scales be free between values
of one factor, but some within. Thus, in this example,

xyplot(mpg ~ disp | factor(gear) + factor(cyl), mtcars,
    scales=list(x=list(relation=free)))

rather than having the x scales be free within a gear as well, I want it to
be the same for the gear, but free between cyl (and thus only have x scales
below the bottom panels with no scales or white space in the middle between
panels).

Any help would be greatly appreciated!

-Orion

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] dse model setup help

2009-07-13 Thread Bob McCall
I tried to specify a model in dse1 but something isn't right. Anybody
have any tips?

 model-SS(F=f,G=g,H=h,Q=q,z0=z,P0=p)
Error in locateSS(model$R, constants$R, R, p, p, plist) :
  The dimension of something in the SS model structure is bad.

 dim(f)
[1] 5 5
 dim(g)
[1] 5 1
 dim(h)
[1] 1 5
 dim(q)
[1] 5 5
 dim(z)
[1] 5 1
 dim(p)
[1] 5 5

thanks,
Bob

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] survSplit with data.frame containing a Surv object

2009-07-13 Thread Heinz Tuechler

Dear All,

since years I am struggling with Surv objects in data.frames. The 
following seems to have to do with it.
See below the modified example from the help page of survSplit. The 
original works, as expected. If, however, a Surv object is added to 
the data.frame, each record gets doubled.

Is there some solution other than avoiding Surv objects in data.frames?

Thanks,
Heinz


require(survival)

## from the help page
aml3-survSplit(aml,cut=c(5,10,50),end=time,start=start,
  event=status,episode=i)

summary(aml)
summary(aml3)

coxph(Surv(time,status)~x,data=aml)
## the same
coxph(Surv(start,time,status)~x,data=aml3)

## added to show doubling of records
aml.so - aml
aml.so$surv.object - with(aml, Surv(time, status))

aml3.so - survSplit(aml.so ,cut=c(5,10,50),end=time,start=start,
 event=status,episode=i)
summary(aml3.so)

sessionInfo('survival')
R version 2.9.1 Patched (2009-07-07 r48910)
i386-pc-mingw32

locale:
LC_COLLATE=German_Switzerland.1252;LC_CTYPE=German_Switzerland.1252;LC_MONETARY=German_Switzerland.1252;LC_NUMERIC=C;LC_TIME=German_Switzerland.1252

attached base packages:
character(0)

other attached packages:
[1] survival_2.35-4

loaded via a namespace (and not attached):
[1] base_2.9.1  graphics_2.9.1  grDevices_2.9.1 methods_2.9.1
[5] splines_2.9.1   stats_2.9.1 utils_2.9.1

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] variance explained by each predictor in GAM

2009-07-13 Thread Kayce Anderson
Simon,That produced exactly what I was looking for.  Thanks so much for the
humble help.

KC

On Mon, Jul 13, 2009 at 9:10 AM, Simon Wood s.w...@bath.ac.uk wrote:

 You can get some idea by doing something like the following, which compares
 the r^2 for models b and b2, i.e. with and without s(x2).  It keeps the
 smoothing parameters fixed for the comparison. (s(x,fx=TRUE) removes
 penalization altogether btw, which is not what was wanted).

 dat - gamSim(1,n=400,dist=normal,scale=2)
 b-gam(y~s(x0)+s(x1)+s(x2)+s(x3),data=dat)
 b2-gam(y~s(x0)+s(x1)+s(x3),sp=b$sp[-3],data=dat)
 summary(b2)$dev.expl
 summary(b)$dev.expl


 On Monday 13 July 2009 15:09, Kayce Anderson wrote:
  Many thanks for the advice David. I would really like to figure out,
  though, how to get the contribution of each factor to the Rsq - something
  like a Beta coefficient for GAM.   Ideas?
  KC
 
  On Sun, Jul 12, 2009 at 5:41 PM, David Winsemius
 dwinsem...@comcast.netwrote:
   On Jul 12, 2009, at 5:06 PM, Kayce Anderson wrote:
  
Hi,
  
   I am using mgcv:gam and have developed a model with 5 smoothed
   predictors and one factor.
  
   gam1 - gam(log.sp~ s(Spr.precip,bs=ts)  + s(Win.precip,bs=ts) +
 s(
   Spr.Tmin,bs=ts)  + s(P.sum.Tmin,bs=ts) + s( Win.Tmax,bs=ts)
   +factor(site),data=dat3)
  
  
   The total deviance explained = 70.4%.
  
  
   I would like to extract the variance explained by each predictor.  Is
   there
   a straightforward way to do this?  I have tried dropping a term and
   recalculating the model, but the edf's change if there is any
   correlation among variables, thereby making all of the relationships
   different.  I haven't yet figured out how to fix the smoothing terms-
 I
   get syntax error messages.  Among other variations, I tried, for
   example,
   log.sp~s(Spr.precip, sp=3.9, fx=TRUE) +...
  
   ?anova.gam
  
   Obviously I cannot test this with your dat3. You get an F-statistic for
   each s() term by default and you are referred to saummary.gam for
 further
   explanation.
  
   David Winsemius, MD
   Heritage Laboratories
   West Hartford, CT
 
[[alternative HTML version deleted]]
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
  http://www.R-project.org/posting-guide.html and provide commented,
 minimal,
  self-contained, reproducible code.

 --
  Simon Wood, Mathematical Sciences, University of Bath, Bath, BA2 7AY UK
  +44 1225 386603  www.maths.bath.ac.uk/~sw283


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to perform a Likelihood-ratio test?

2009-07-13 Thread Jeff Xu

Hi Lars,

Using the definition of Likelihood-ratio test is straightforward, 

Jeff

Lars Bergemann wrote:
 
 
 Hi.
 
  
 
 I would like to use a likelihood-ratio test to compare a linear model
 (lm()) to another linear model containing the first one to see if the
 extra factors are needed - but I was unable to find any help on how to do
 that.
 
 Could you help me and tell me what to do? Or tell me where to find help on
 this topic?
 
  
 
 Many thanks in advance!
 
  
 
 Lars
 
 _
 
 s. It's easy!
 
 aspxmkt=en-us
   [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 

-- 
View this message in context: 
http://www.nabble.com/How-to-perform-a-Likelihood-ratio-test--tp24460221p24463269.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] graph: axis label font

2009-07-13 Thread serbring

Hi, 

excuse me for my english, i am using R on windows and i have to do several
graphs with axis labels and the axis text thicks has a specified font type,
(Arial) and a specified font size. How can i do these? Thank you in advance

-- 
View this message in context: 
http://www.nabble.com/graph%3A-axis-label-font-tp24463974p24463974.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] getting a timeseries element into a string

2009-07-13 Thread tradenet

Thanks Ron.

I apologize for not properly specifying which class/package I was using.  I
am using the timeSeries class from the timeSeries Package, version 2100.83
from rmetrics.org

The code you supplied works for me! -- thanks again.

Warm regards,

Andrew


--

Ron Burns-2 wrote:
 
 Here is what I do:
 
 R library(fBasics)
 R ts-dummyDailySeries(x = rnorm(365), units = NULL, zone = , 
 FinCenter =)
 R class(ts)
 [1] timeSeries
 attr(,package)
 [1] timeSeries
 R (s - as.character(time(ts)[1]))
 [1] 1970-01-01
 R class(s)
 [1] character
 
 
 
 
 
 
 tradenet wrote:
 I added a reproducible example to my question...

 ts-dummyDailySeries(x = rnorm(365), units = NULL, zone = , FinCenter =
 )

   
 (ts[1,0])  #returns first date in return series
 
 GMT

 1970-01-01

   
 ttt-(sprintf(%s,ts[1,0]))
 

   
 print(ttt)
 
 character(0)

   
 ttt-(ts[1,0])
 

   
 print(ttt)
 
 GMT

 1970-01-01

 what I want to get is a string containing 1970-01-01

 Thank you.







 tradenet wrote:
   
 I have a timeseries object, ts, and want to get the first date in the
 series into a string so I can concatenate it with a SQL query. Input and
 output are shown below.  I must be missing something very basic, but I
 can't seem to pry the data (2008-07-01) into a string variable.  Any
 suggestions would be appreciated.

 Thank you,

 Andrew

 =script:
 class(ts)
 (ts[1,0])  #returns first date in return series
 ttt-(sprintf(%s,ts[1,0]))
 print(ttt)

 =output:


 
 class(ts)
   
 [1] timeSeries
 attr(,package)
 [1] timeSeries

 
 (ts[1,0])  #returns first date in return series
   
 GMT

 2008-07-01

 
 ttt-(sprintf(%s,ts[1,0]))
   
 print(ttt)
   
 character(0)




 

   
 
 
 -- 
 R. R. Burns
 Physicist (Retired)
 Oceanside, CA
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 

-- 
View this message in context: 
http://www.nabble.com/getting-a-timeseries-element-into-a-string-tp24429876p24465343.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to perform a Likelihood-ratio test?

2009-07-13 Thread Emmanuel Charpentier
Le lundi 13 juillet 2009 à 13:04 +0200, Lars Bergemann a écrit :
 Hi.
 
  
 
 I would like to use a likelihood-ratio test to compare a linear model (lm()) 
 to another linear model containing the first one to see if the extra factors 
 are needed - but I was unable to find any help on how to do that.
 
 Could you help me and tell me what to do? Or tell me where to find help on 
 this topic?

?anova? Check the test argument ...:-)

Emmanuel Charpentier

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to perform a Likelihood-ratio test?

2009-07-13 Thread Charles C. Berry

On Mon, 13 Jul 2009, Jeff Xu wrote:



Hi Lars,

Using the definition of Likelihood-ratio test is straightforward,



To which I would add, following the instructions in the _posting guide_ is 
equally straightforward. viz



help.search(loglikelihood)

takes you to

?stats::logLik

Chuck



Jeff

Lars Bergemann wrote:



Hi.



I would like to use a likelihood-ratio test to compare a linear model
(lm()) to another linear model containing the first one to see if the
extra factors are needed - but I was unable to find any help on how to do
that.

Could you help me and tell me what to do? Or tell me where to find help on
this topic?



Many thanks in advance!



Lars

_

s. It's easy!

aspxmkt=en-us
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




--
View this message in context: 
http://www.nabble.com/How-to-perform-a-Likelihood-ratio-test--tp24460221p24463269.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Charles C. Berry(858) 534-2098
Dept of Family/Preventive Medicine
E mailto:cbe...@tajo.ucsd.edu   UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to perform a Likelihood-ratio test?

2009-07-13 Thread Jorge Ivan Velez
Dear Lars,
One way would be

anova(model1, model2)

See ?anova for more information.

HTH,

Jorge



On Mon, Jul 13, 2009 at 7:04 AM, Lars Bergemann
lars.bergem...@hotmail.comwrote:


 Hi.



 I would like to use a likelihood-ratio test to compare a linear model
 (lm()) to another linear model containing the first one to see if the extra
 factors are needed - but I was unable to find any help on how to do that.

 Could you help me and tell me what to do? Or tell me where to find help on
 this topic?



 Many thanks in advance!



 Lars

 _

 s. It's easy!

 aspxmkt=en-us
[[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Help me get this function to work...

2009-07-13 Thread Chip Maney

I have a function (see below).  This function has one object, ID.  If I run the 
loops by itself using a character value (ie,VFFF1-7), then the loops work 
fine.  However, when I try to insert the character value via the function call, 
it doesn't work. I don't get an error, but the TotalCover.df dataframe does not 
update according to the loop criteria. Any obvious problems that you can see?

Cover Function#

#Define Variables
Quadrats.df-unique(data.df$Quadrat)
TotalCover.df-cbind(0:750/10,0,0,0,0,0,0)
colnames(TotalCover.df)- 
c(Station,Shrub,Tree,Invasive,Herb,Litter,Bare)
Shrub.df-data.df[data.df$Layer==Shrub,]
Tree.df-data.df[data.df$Layer==Tree,]

Cover.fn-function(ID){

ShrubCover.df-Shrub.df[Shrub.df$Quadrat==ID,]
for (j in 1:length(ShrubCover.df[,Quadrat])){
for (i in 1:751){
if(TotalCover.df[i,Station]=ShrubCover.df[j,Start]  
TotalCover.df[i,Station]= ShrubCover.df[j,Stop]) 
TotalCover.df[i,Shrub]- 1
} 
}
TreeCover.df-Tree.df[Tree.df$Quadrat==ID,]
for (j in 1:length(TreeCover.df[,Quadrat])){
for (i in 1:751){
if(TotalCover.df[i,Station]=TreeCover.df[j,Start]  
TotalCover.df[i,Station]= TreeCover.df[j,Stop]) 
TotalCover.df[i,Tree]- 1 
}
}
}


Cover.fn(VFFF1-7)

_

 right from Hotmail. Check it out.

M_WL_QA_HM_celebrity_photos2_072009cat=celebrity
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] read.delim skips first column (why?)

2009-07-13 Thread Giovanni Marco Dall'Olio
Hi people,
I have a text file like this one posted:

snp_id  genechromosome  distance_from_gene_center
positionpop1pop2pop3pop4pop5pop6pop7
rs2129081   RAPT2   3   -129993 upstream  0.439009
1.169210NA  0.2330200.093042NA
-0.902596
rs1202698   RAPT2   3   -128695 upstream  NA
1.815000NA  0.3990791.8142701.382950
NA
rs1163207   RAPT2   3   -128224 upstream  NA  NA
NA  NA  NA  NA  NA
rs1834127   RAPT2   3   -128106 upstream  NA  NA
NA  NA  NA  NA  2.180670
rs2114211   RAPT2   3   -126738 upstream  -0.468279
-1.447620   NA  0.010616-0.414581   NA
0.550447
rs2113151   RAPT2   3   -124620 upstream  -0.897660
-1.971020   NA  -0.920327   -0.764658   NA
0.337127
rs2524130   RAPT2   3   -123029 upstream  -0.109795
-0.004646   -0.412059   1.1167400.667567
-0.924529   0.962841
rs1381318   RAPT2   3   -12818  upstream  -0.911662
-1.791580   NA  -0.945716   -1.239640   NA
0.004876
rs2113319   RAPT2   3   -122028 upstream  -0.911662
-1.738610   NA  -0.945716   -1.240950   NA  -0.005318

When I use read.delim (or any read function) on it, R skips the first
column, and I don' understand why.

For example:
$: R
 data = read.delim('snp_file.txt', head=T, sep='\t')

Now, I would expect data$snp_id to contain snp ids, and data$gene to contain
gene names; but it is not like this:

 data$snp_id
[1] RAPT2 RAPT2 RAPT2 RAPT2 RAPT2 RAPT2 RAPT2 RAPT2 RAPT2
Levels: RAPT2
 data$gene
[1] 3 3 3 3 3 3 3 3 3

 summary(data)
  snp_id   gene chromosome  distance_from_gene_center
 RAPT2:9   Min.   :3   Min.   :-129993   upstream:9
   1st Qu.:3   1st Qu.:-128224
   Median :3   Median :-126738
   Mean   :3   Mean   :-113806
   3rd Qu.:3   3rd Qu.:-123029
   Max.   :3   Max.   : -12818


 data$pop7
[1] NA NA NA NA NA NA NA NA NA


Notice that it did use snp_id as the header for the first column, but it
skips completely al the data from that column, and all the fields are
shifted, so the last column is filled with NA values.

What I am doing wrong? Can it be a problem of my data files? I have tried to
modify them a bit (add new columns, etc..) but it didn't work.

I am running R from an Ubuntu system:
 sessionInfo()
R version 2.9.1 (2009-06-26)
i486-pc-linux-gnu

locale:
LC_CTYPE=it_IT.UTF-8;LC_NUMERIC=C;LC_TIME=it_IT.UTF-8;LC_COLLATE=it_IT.UTF-8;LC_MONETARY=C;LC_MESSAGES=it_IT.UTF-8;LC_PAPER=it_IT.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=it_IT.UTF-8;LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base




-- 
Giovanni Dall'Olio, phd student
Department of Biologia Evolutiva at CEXS-UPF (Barcelona, Spain)

My blog on bioinformatics: http://bioinfoblog.it

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] graph: axis label font

2009-07-13 Thread Michael Knudsen
On Mon, Jul 13, 2009 at 5:31 PM, serbringbracard...@email.it wrote:

 excuse me for my english, i am using R on windows and i have to do several
 graphs with axis labels and the axis text thicks has a specified font type,
 (Arial) and a specified font size. How can i do these? Thank you in advance

Interesting question, I didn't know the answer to, so I tried to look
it up. There might be some help towards the bottom of this page:

http://www.statmethods.net/advgraphs/parameters.html

It seems to be specific for Windows, so I can't test it myself.

-- 
Michael Knudsen
micknud...@gmail.com
http://lifeofknudsen.blogspot.com/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Help get this simple function to work...

2009-07-13 Thread chipmaney

I have a function (see below).  This function has one object, ID.  If I run
the loops by themselves using a character value (ie,VFFF1-7) instead of
the function object, then the loops work fine.  However, when I try to
insert the character value via the function call, it doesn't work. I don't
get an error, but the TotalCover.df dataframe does not update according to
the loop criteria. Any obvious problems that you can see?
 
Cover Function#
 
#Define Variables
Quadrats.df-unique(data.df$Quadrat)
TotalCover.df-cbind(0:750/10,0,0,0,0,0,0)
colnames(TotalCover.df)-
c(Station,Shrub,Tree,Invasive,Herb,Litter,Bare)
Shrub.df-data.df[data.df$Layer==Shrub,]
Tree.df-data.df[data.df$Layer==Tree,]
 
Cover.fn-function(ID){
 
ShrubCover.df-Shrub.df[Shrub.df$Quadrat==ID,]
for (j in 1:length(ShrubCover.df[,Quadrat])){
for (i in 1:751){
if(TotalCover.df[i,Station]=ShrubCover.df[j,Start]
 TotalCover.df[i,Station]= ShrubCover.df[j,Stop]) 
TotalCover.df[i,Shrub]- 1
} 
}
TreeCover.df-Tree.df[Tree.df$Quadrat==ID,]
for (j in 1:length(TreeCover.df[,Quadrat])){
for (i in 1:751){
if(TotalCover.df[i,Station]=TreeCover.df[j,Start]
 TotalCover.df[i,Station]= TreeCover.df[j,Stop]) 
TotalCover.df[i,Tree]- 1 
}
}
}
 
 
Cover.fn(VFFF1-7)
-- 
View this message in context: 
http://www.nabble.com/Help-get-this-simple-function-to-work...-tp24466533p24466533.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] survSplit with data.frame containing a Surv object

2009-07-13 Thread Charles C. Berry

On Mon, 13 Jul 2009, Heinz Tuechler wrote:


Dear All,

since years I am struggling with Surv objects in data.frames. The following 
seems to have to do with it.
See below the modified example from the help page of survSplit. The original 
works, as expected. If, however, a Surv object is added to the data.frame, 
each record gets doubled.

Is there some solution other than avoiding Surv objects in data.frames?


I think you can modify survSplit so that it will properly handle Surv 
objects.


Change this line:

newdata - lapply(data, rep, ntimes + 1)

to this:

newdata - lapply(data,
function(x) {
x - as.matrix(x)
x[rep(1:nrow(x), ntimes + 1),]
})

or something similar that results Surv objects being rep()'ed rowwise 
rather than elementwise and returned as objects of the right 
dimension (rather than as a vector).


Caveat: This works in the example you give, but I've not tested this 
extensively.


HTH,

Chuck





Thanks,
Heinz


require(survival)

## from the help page
aml3-survSplit(aml,cut=c(5,10,50),end=time,start=start,
  event=status,episode=i)

summary(aml)
summary(aml3)

coxph(Surv(time,status)~x,data=aml)
## the same
coxph(Surv(start,time,status)~x,data=aml3)

## added to show doubling of records
aml.so - aml
aml.so$surv.object - with(aml, Surv(time, status))

aml3.so - survSplit(aml.so ,cut=c(5,10,50),end=time,start=start,
event=status,episode=i)
summary(aml3.so)

sessionInfo('survival')
R version 2.9.1 Patched (2009-07-07 r48910)
i386-pc-mingw32

locale:
LC_COLLATE=German_Switzerland.1252;LC_CTYPE=German_Switzerland.1252;LC_MONETARY=German_Switzerland.1252;LC_NUMERIC=C;LC_TIME=German_Switzerland.1252

attached base packages:
character(0)

other attached packages:
[1] survival_2.35-4

loaded via a namespace (and not attached):
[1] base_2.9.1  graphics_2.9.1  grDevices_2.9.1 methods_2.9.1
[5] splines_2.9.1   stats_2.9.1 utils_2.9.1

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




Charles C. Berry(858) 534-2098
Dept of Family/Preventive Medicine
E mailto:cbe...@tajo.ucsd.edu   UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help me get this function to work...

2009-07-13 Thread David Winsemius
In R a function only returns the last evaluation, so you need to wrap  
up all of the local results into a list at the end of the function.



On Jul 13, 2009, at 1:23 PM, Chip Maney wrote:



I have a function (see below).  This function has one object, ID.   
If I run the loops by itself using a character value (ie,VFFF1-7),  
then the loops work fine.  However, when I try to insert the  
character value via the function call, it doesn't work. I don't get  
an error, but the TotalCover.df dataframe does not update according  
to the loop criteria. Any obvious problems that you can see?


Cover Function#

#Define Variables
Quadrats.df-unique(data.df$Quadrat)
TotalCover.df-cbind(0:750/10,0,0,0,0,0,0)
   colnames(TotalCover.df)-  
c(Station,Shrub,Tree,Invasive,Herb,Litter,Bare)

Shrub.df-data.df[data.df$Layer==Shrub,]
Tree.df-data.df[data.df$Layer==Tree,]

Cover.fn-function(ID){

   ShrubCover.df-Shrub.df[Shrub.df$Quadrat==ID,]
   for (j in 1:length(ShrubCover.df[,Quadrat])){
   for (i in 1:751){
   if 
(TotalCover.df[i,Station]=ShrubCover.df[j,Start]   
TotalCover.df[i,Station]= ShrubCover.df[j,Stop])

   TotalCover.df[i,Shrub]- 1
   }
   }
   TreeCover.df-Tree.df[Tree.df$Quadrat==ID,]
   for (j in 1:length(TreeCover.df[,Quadrat])){
   for (i in 1:751){
   if 
(TotalCover.df[i,Station]=TreeCover.df[j,Start]   
TotalCover.df[i,Station]= TreeCover.df[j,Stop])

   TotalCover.df[i,Tree]- 1
   }
   }

# Perhaps something like:

list(ShrubCover.df, TreeCover.df, TotalCover.df)


}


Cover.fn(VFFF1-7)

_

right from Hotmail. Check it out.

M_WL_QA_HM_celebrity_photos2_072009cat=celebrity
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] how to run a R program with input arguments

2009-07-13 Thread edisonying

I am a beginner in R and know only a little about it yet. I have a script
written in R language, named as a.txt for example. I am using a Linux
machine, at present I only know that I can type R in the terminal and then
copy-paste the content in a.txt to the R's interface to execute the
program. However, I want to know if there is any method that allows me to
add some input arguments directly after the R in the terminal.
Specifically, I want to type the following in the cmd line:

   R (some flags or option) a.txt

then the program will begin to run. Besides, if the program will read a data
file first, can I also specify the data file in the command line? Then the
complete command will become:

   R  a.txt  data.txt

This is important for a beginner. Thanks very much!:working:
   
-- 
View this message in context: 
http://www.nabble.com/how-to-run-a-R-program-with-input-arguments-tp24465852p24465852.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] object .trPaths not found

2009-07-13 Thread Thomas Petzoldt

Hi,

using a variable named .trPaths is a feature and a frequent nuisance of 
recent Tinn-R and also explained in the Tinn-R docs.


Solution 1:

In Tinn-R select the menu:

 R -- Configure -- and then Temporarily or Permanent

Solution 2:

Manually edit file R-installation-directory/etc/Rconfig.site

The minimal line to add is:

.trPaths - paste(paste(Sys.getenv('APPDATA'), '\\Tinn-R\\tmp\\', 
sep=''), c('', 'search.txt', 'objects.txt', 'file.r', 'selection.r', 
'block.r', 'lines.r'), sep='')



Hope it helps

Thomas P.


Knut Krueger schrieb:

Uwe Ligges schrieb:




I have to admit that I have no idea what we are talking about here 
(yes, I tend to forget many things these days) - and you have not 
cited the original message, unfortunately (nor have you specifies R 
versions, Tinn-R versions and both OS versions, but just one) ...


Best wishes,
Uwe

Original question
rkevinbur...@charter.net wrote:
I am running an R script with Tinn-R (2.2.0.1) and I get the error 
message

 
Error in source(.trPaths[4], echo = TRUE, max.deparse.length = 150) : 
  object .trPaths not found


Any solutions?

System

R 2.8.1
Tinn-R 2.2.02
windows Xp professional Servicepack 3

This combination works on my Pc without problems
And the code is working on the other system with Tinn-R problems
with the R-Editor without problems
so it seems to be a Tinn_R problem

but maybe anybody has a solution for that.

Regards Knut

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help me get this function to work...

2009-07-13 Thread Mark Knecht
On Mon, Jul 13, 2009 at 12:08 PM, David Winsemiusdwinsem...@comcast.net wrote:
 In R a function only returns the last evaluation, so you need to wrap up all
 of the local results into a list at the end of the function.



SNIP

 David Winsemius, MD
 Heritage Laboratories
 West Hartford, CT


How important it is to wrap the list in a return statement, ala

return(list(ShrubCover.df, TreeCover.df, TotalCover.df))

or

answer - list(ShrubCover.df, TreeCover.df, TotalCover.df)
return(answer)

Thanks,
Mark

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to run a R program with input arguments

2009-07-13 Thread Henrique Dallazuanna
See commandArgs() function.

On Mon, Jul 13, 2009 at 4:15 PM, edisonying edisonying1...@gmail.comwrote:


 I am a beginner in R and know only a little about it yet. I have a script
 written in R language, named as a.txt for example. I am using a Linux
 machine, at present I only know that I can type R in the terminal and
 then
 copy-paste the content in a.txt to the R's interface to execute the
 program. However, I want to know if there is any method that allows me to
 add some input arguments directly after the R in the terminal.
 Specifically, I want to type the following in the cmd line:

   R (some flags or option) a.txt

 then the program will begin to run. Besides, if the program will read a
 data
 file first, can I also specify the data file in the command line? Then the
 complete command will become:

   R  a.txt  data.txt

 This is important for a beginner. Thanks very much!:working:

 --
 View this message in context:
 http://www.nabble.com/how-to-run-a-R-program-with-input-arguments-tp24465852p24465852.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] \dQuote in packages

2009-07-13 Thread Rebecca Sela
Thank you!  (That was easy to fix.)

How does one deal with quoting (in \reference)?  The following line causes 
problems:
\references{Sela, Rebecca J., and Simonoff, Jeffrey S., \dQuote{RE-EM Trees: A 
New Data Mining Approach for Longitudinal Data}.}
The error given is:
Warning in parse_Rd(./man/predict.Rd, encoding = unknown) :
  ./man/predict.Rd:28: unknown macro '\dquote'
*** error on file ./man/predict.Rd
Error : ./man/predict.Rd:28: Unrecognized macro \dquote

The manual for writing R packages said I should not just use the character .  
What should I be using here?

Thanks again!

Rebecca



- Original Message -
From: Uwe Ligges lig...@statistik.tu-dortmund.de
To: Rebecca Sela rs...@stern.nyu.edu
Cc: r-help r-help@r-project.org
Sent: Friday, July 10, 2009 8:17:44 PM GMT -05:00 US/Canada Eastern
Subject: Re: [R] \dQuote in packages



Rebecca Sela wrote:
 Here is one Rd file with problems, now inline so that it can be read:
 
 \name{simpleREEMdata}
 \docType{data}
 \alias{simpleREEMdata}
 \title{Sample Data for RE-EM trees}
 \description{
 This data set is consists of a panel of 50 individuals with 12 observations 
 per individual.  The data is based on a regression tree with an initial split 
 based on a dummy variable (\code{D}) and a second split based on time in the 
 branch where \code{D=1}.  The observations include both randomly generated 
 individual-specific effects and observation-specific errors.
 }
 \format{
 The data has 600 rows and 5 columns.  The columns are: 

insert here:

\itemize{

 \item{\code{Y}}{the target variable}
 \item{\code{t}}{a numeric predictor (time)}
 \item{\code{D}}{a catergorical predictor with two levels, 0 and 1}
 \item{\code{ID}}{the identifier for each individual}
 \item{\code{X}}{another covariate (which is intentionally unrelated to the 
 target variable)}

insert here:

}


or in other words, you need an itemize environment in order to use \item 
within \format, see the manual Writing R Extensions.

Best,
Uwe


 }
 \references{Sela, Rebecca J., and Simonoff, Jeffrey S., \dQuote{RE-EM Trees: 
 A New Data Mining Approach for Longitudinal Data}.}
 \keyword{datasets}
 
 Thanks again for your help!
 
 Rebecca
 
 - Original Message -
 From: Uwe Ligges lig...@statistik.tu-dortmund.de
 To: Rebecca Sela rs...@stern.nyu.edu
 Cc: r-help r-help@r-project.org
 Sent: Thursday, July 9, 2009 6:05:46 AM GMT -05:00 US/Canada Eastern
 Subject: Re: [R] \dQuote in packages
 
 Rebecca,
 
 the attachments have been stripped off by the mailing list.
 
 
 Rebecca Sela wrote:
 That's good to know.  I have attached three Rd files that gave errors 
 (others gave identical errors).  I would love to know what is wrong with 
 them.

 I'm using 2.1.1 because that is what is installed on the Linux computer I 
 have access to.  (I haven't bothered figuring out how to assemble a package 
 in Windows.)
 
 You should *really* upgrade! That version is outdated for several years now.
 
 How to do it on Windows: See the R Installation and Administration 
 manual with its corresponding section.
 
 Best,
 Uwe
 
 
 Thank you for your help!

 Rebecca


 - Original Message -
 From: Uwe Ligges lig...@statistik.tu-dortmund.de
 To: Rebecca Sela rs...@stern.nyu.edu
 Cc: r-help r-help@r-project.org
 Sent: Wednesday, July 8, 2009 6:11:33 PM GMT -05:00 US/Canada Eastern
 Subject: Re: [R] \dQuote in packages

 The difference you are experiencing is the new Rd2 parser that is more 
 picky now (but also prevents to produce wrong documentation files).

 If you make the code of the Rd available, someone might be able to help.

 Are you really under R-2.1.1 ??? That is really ancient!


 Best,
 Uwe Ligges



 Rebecca Sela wrote:
 I am in the process of submitting a package to CRAN.  R CMD check ran 
 successfully on the package on my local computer, using R version 2.1.1.  
 However, on the computers for CRAN (with version 2.10.0), the following 
 errors occurred:

 Warning in parse_Rd(./man/predict.Rd, encoding = unknown) :
   ./man/predict.Rd:28: unknown macro '\dquote'
 *** error on file ./man/predict.Rd
 Error : ./man/predict.Rd:28: Unrecognized macro \dquote
 Warning in parse_Rd(./man/print.Rd, encoding = unknown) :
   ./man/print.Rd:17: unexpected UNKNOWN '\sideeffects'
 Warning in parse_Rd(./man/simpleREEMdata.Rd, encoding = unknown) :
   ./man/simpleREEMdata.Rd:10: unknown macro '\item'

 Are \dquote, \sideeffects, and \item not supported in newer versions of R?  
 Is there some underlying problem that I should fix that makes these show up?

 Thank you very much.

 Rebecca

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

 

 __
 

[R] .RData signature

2009-07-13 Thread stenka1

   Is there a way to identify the beginning and end of an .RData file?
   I am trying to restore lost files on opensolaris where the table of contents
   is overwritten and will be reading the raw disk byte by byte.
   I looked at several files in hex and the end seems different, while most of
   them, but not all begin with RDX2.X
   Thank you.
   Stephen C. Bond
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] error: optim(rho, n2ll.rho, method = method, control = control, beta = parm$beta, : initial value in 'vmmin' is not finite

2009-07-13 Thread Ravi Varadhan
Hi Brandy,

The `vmmin' refers to a variable metric algorithm, which is a quasi-Newton
method for optimization.  This algorithm is used when `method=BFGS' in the
optim() call.  The quasi-Newton methods iteratively build an approopriate
Hessian matrix, which is of dimension p x p, where p is the problem size.
In your case the hessian matrix is 4000 x 4000, which BFGS isi unable to
handle.  

If the code for the function lnam() and dependent functions is entirely in
R, you can try a different algorithm than BFGS.  For example, you can try
CG, which can handle high-dimensional optimization.  However, before doing
that I would make sure that CG works well and gives same answer as BFGS
on smaller problems.  If CG doesn't cut it, then I would try the spg()
function in the BB package.

If the code for lnam() is not in R, then I would contact the package author
to help you out with incorporating other optimizers for your
high-dimensional problem.

Hope this helps,
Ravi.



---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: rvarad...@jhmi.edu

Webpage:
http://www.jhsph.edu/agingandhealth/People/Faculty_personal_pages/Varadhan.h
tml

 





-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Brandy Lee Aven
Sent: Thursday, July 09, 2009 11:11 PM
To: r-help@r-project.org
Subject: [R] error: optim(rho, n2ll.rho, method = method, control = control,
beta = parm$beta, : initial value in 'vmmin' is not finite

I am trying to use the lnam autocorrelation model from the SNA package. I
have it running for smaller adjacency matrices (1,500) it works just fine
but when my matrices are bigger 4000+. I get the error: 

 lnam1_01.adj- lnam(data01$adopt,x01,ec2001.csr)
Error in optim(rho, n2ll.rho, method = method, control = control, beta =
parm$beta,  : 
  initial value in 'vmmin' is not finite


I have looked at the lnam code and cant even figure out what vmmin is. 
Is there anyway around this? Am I doing something wrong? What makes me think
that its about the size of the adjacency matrix is that I can run the same
command on similar objects that are just smaller and it works fine. 

please help!

sessionInfo()
R version 2.9.1 (2009-06-26)
x86_64-pc-linux-gnu 

locale:
C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base 

other attached packages:
[1] numDeriv_2009.2-1 sna_2.0-1

loaded via a namespace (and not attached):
[1] tools_2.9.1

 class(data01$adopt) #This is the response vector y
[1] integer

data01$adopt[1:10] # Its just a binary outcome for all vertices
[1] 0 0 0 0 0 0 0 0 0 0 ..until 4,003

 class(x01) #X01 is a matrix of my six covariates for all vertices

[1] matrix

#here is the an example of the data
x01[1:10,1:6]
on01 indegree outdegree between eigen numalters01
1   1   0   0   0   1   1
19  1   0   1   0   0   1
123 1   0   1   0   0   1
140 1   0   1   0   0   1
169 1   0   1   0   0   1
189 1   0   1   0   0   1
195 1   0   1   0   0   2
204 1   0   1   0   0   1
231 1   0   2   0   0   1
252 1   0   3   0   0   4

# this is the adjacency matrix (in Sparse matrix format) that causes the
error. I have another that is 10,500 and does the same thing.
dim(ec2001.csr)
[1] 4003 4003

class(ec2001.csr)
[1] matrix.csr
attr(,package)
[1] SparseM

ec2001.csr[1:10,1:10] #here is what it looks like
1 19 123 140 169 189 195 204 231 252
1   1  0   0   0   0   0   0   0   0   0
19  0  0   0   1   0   0   0   0   0   0
123 0  0   0   0   0   0   0   0   0   0
140 0  0   0   0   0   0   0   0   0   0
169 0  0   0   0   0   0   0   0   0   0
189 0  0   1   0   0   0   0   0   0   0
195 0  0   0   0   0   0   0   0   0   0
204 0  0   0   0   0   0   0   0   0   0
231 0  0   0   0   0   0   0   0   0   0
252 0  0   0   0   0   0   0   0   0   0


#There are also no infinite values in the objects. 
is.infinite(x01)
[1] FALSE . N

is.infinite(data01$adopt)
[1] FALSE .N

is.infinite(ec2001.csr)
[1] FALSE

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 

[R] regression with replication

2009-07-13 Thread Agus Susanto
Dear all,
I would like to fit a linear regression with replication (on each year,
observation is replicated, e.g 4 times). The independent variable ranges
for instance 1-5 year, so I expect to have a linear fit of 5 points.
For that purpose I do these (with dummy variables x and y):

x-rep(seq(1:5),4)
y-rnorm(20)
linreg-lm(y~x)
fitted.values(linreg)  # why produce 20 points of estimate?
predict(linreg)# why produce 20 points of estimate?

Please somebody explain:
1. why both fitted.values and predict functions produced 20 points of
estimate, NOT 5 points.
2. is lm(y~x) correct to solve this regression case, or there's a
correct procedure.

Many thanks.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help me get this function to work...

2009-07-13 Thread Bert Gunter
How important it is to wrap the list in a return statement, ala

return(list(ShrubCover.df, TreeCover.df, TotalCover.df))

or

answer - list(ShrubCover.df, TreeCover.df, TotalCover.df)
return(answer)
---

Completely Un.

Consult the R Docs, especially the R Language Definition manual, for answers
to this and your other numerous basic questions. That's what they're there
for.

-- Bert Gunter
Genentech, Inc.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] testing equality of two dependent correlations + normality issue

2009-07-13 Thread Jaroslav Hlinka

Hi, I am turning to you with a (hopefully simple?) stats question. I would
like to test equality of two correlation coefficients in a setting with
three variables X,Y,Z, i.e. equality of r(X,Y) and r(Z,Y). I have found a
formula to transform the 2 dependent correlations difference to
t-distribution with N-3 df:
t = (rxy - rzy)* SQRT[{(n - 3)(1 + rxz)}/ {2(1 - rxy^2 - rxz^2 - rzy^2 +
2rxy*rxz*rzy)}]
(Blalock, H., 1972. Social Statistics. NY: McGraw-Hill. Page 406-7). Am
actually not sure whether this is exact or approximate (even given normality
assumption, the Fisher's Z-transform which this is - I assume - based on, is
approximate, right?).
But to make it a bit more complicated, Shapiro-Wilks test of normality gives
p=0.022 for variable X. Therefore assuming normality may not be safe
(justifiable) at all? What do I do then? Do I report this test as
assymptotically valid, or do I run some other test?

Any ideas? Many thanks in advance,
Jaroslav

-- 
View this message in context: 
http://www.nabble.com/testing-equality-of-two-dependent-correlations-%2B-normality-issue-tp24465768p24465768.html
Sent from the R help mailing list archive at Nabble.com.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] re ference

2009-07-13 Thread Derek Lacoursiere

Hello R users,

I have developed code in R that generates populations of non-overlapping
random polygons (polygonal maps) for testing certain stereological
estimation methods.  I have tried to look for articles or papers online on
the generation of random polygonal maps and have found very little.  There
seems to be a much greater concern with generating single random polygons
rather than whole maps of random polygons.  Does anyone have an idea of
where I should look or what keywords to use in my search?

I really appreciate your patience with R novices like myself.
-- 
View this message in context: 
http://www.nabble.com/reference-tp24467408p24467408.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Reducing arrays for comparison with each other...

2009-07-13 Thread Mark Knecht
Hi,
   First up, thanks again for all the help I'm getting on this list.
I'm making great headway in analyzing my experimental data on an
experiment by experiment basis. No way I could have done this in the
time I've done it without your help.

   This email is partially a question about R but is also soliciting
overall guidance if anyone wants to give it. I somehow managed to get
an electrical engineering degree decades ago from the University of
California system without ever being required to take a statistics
class so in that area I'm lacking background. Maybe there's a good
subject someone can point me toward. (Keep it simple though!)

   OK, I've now got dozens of experiments sitting in individual
data.frames. In most cases the data.frames look more or less like the
example below - Experiment Start time down the left side, Experiment
Completion time across the top and results in the cells. The following
data started at 830 and is 3 minute data so the first measurement came
at 833, then 836, etc.

EnTime  836  842   845  848   851  854  900   903
1  833 -3860  -772 -938  -7720 -386
2  8360 -386 00 0 -246  714  -632
3  83900 00  -38600  -772
4  84200  -3860 00 -682 0
5  84500 00 000 0
6  84800 00  -38600 0
7  85100 00 000 0
8  85400 00 00 -386 0
9  85700 00 000 0
10 90000 00 000 0
11 90300 00 00

   The issue I'm thinking about now is how to compare experiments when
the time increment for each is different? This one is 3 minutes,
others are 1 minute, 5 minutes, 13 minutes, etc. My initial thought is
to try and roll things up into common sizes, like 30 minute or 60
minute chunks. Granted, I won't get the same number of measurements in
each chunk, but it's a start and would yield a data.frame that's easy
to view in the Rgui console so I'd be happy with that, but is that a
good method in terms of the statistics of the problem? should I be
paying attention to the number of experiments, maybe as a percentage
of the whole, or something else? I don't know.

   Anyway, I'd like to try rolling this data up by time increment to
start and I'm wondering if there any easy ways to do that? Taking 15
minute increments I would like to get something like

   EnTime845  900 915
1  845  SUM SUM SUM
2  900  SUM SUM SUM
4  915  SUM SUM SUM

where the SUM is the sum of the pieces making up this new cell.

I tried using subset but apparently it doesn't work on arrays like
this, telling me that EnTime=830 isn't valid for factors. I can
certainly find ways to munge the data set, changing EnTime values but
that seems like stuff I shouldn'y be doing. that said it's pretty easy
to do I suspect. (Modulo 15, etc.)

   Is there a way to use cast() to grab time increments. Instead of
cast(Results, EnTime ~ ExTime, ...) returning by the exact value, can
cast collect values together in groups?

   Any ideas appreciated before I start banging away at doing my less
interesting methods.

Thanks,
Mark

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help me get this function to work...

2009-07-13 Thread Duncan Murdoch

On 7/13/2009 3:21 PM, Mark Knecht wrote:

On Mon, Jul 13, 2009 at 12:08 PM, David Winsemiusdwinsem...@comcast.net wrote:

In R a function only returns the last evaluation, so you need to wrap up all
of the local results into a list at the end of the function.




SNIP


David Winsemius, MD
Heritage Laboratories
West Hartford, CT



How important it is to wrap the list in a return statement, ala

return(list(ShrubCover.df, TreeCover.df, TotalCover.df))

or

answer - list(ShrubCover.df, TreeCover.df, TotalCover.df)
return(answer)


Those are almost identical.

The only difference is that the second version will leave the local 
variable answer in the evaluation frame.  For most functions the 
evaluation frame disappears after the return, but sometimes it lives a 
while longer.


For example, you can return it explicitly by putting environment() as 
one of the return values.  More commonly it is returned implicitly as 
the environment of a function created during evaluation, e.g.


 buildf - function() {
+   f - function(x) x + 1
+   return(f)
+ }
 g - buildf()
 g(5)
[1] 6
 ls(environment(g))
[1] f

The last command looks in the environment of g, and sees the local 
variable f there, of which g is a copy.  If you had done


answer - f
return(answer)

you'd also see answer there.

Duncan Murdoch

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] SweaveListingUtils question

2009-07-13 Thread Karsten Weinert
Hello,
recently I read about the SweaveListingUtils package and now I want to try
it out. However, I can not make it work...

Below a minimal example. The problem seems to be the following line
(generated by SweaveListingPreparations()?):

\ifthenelse{\boolean{swe...@gin}}{\setkeys{gin}{width=0.6\textwidth}}{}%

If I comment out this line, it works.

What can I do about this?

I am using the \begin{Scode} notation instead of Rnw files. May this be the
problem? For some technical reason, I would like to stick on this notation.

Any help appreciated,

kind regards,
Karsten.

 begin example

\documentclass[9pt]{beamer}

\usepackage{fancyvrb}
\usepackage{listings}

% choose language and char set
\usepackage[ngerman]{babel}
\usepackage[ansinew]{inputenc}
%

\SweaveOpts{prefix.string=Routput/parcel, keep.source=TRUE}

\begin{Scode}{results=tex, echo=FALSE}
require(SweaveListingUtils)
SweaveListingoptions(intermediate = FALSE)
SweaveListingPreparations()
\end{Scode}


\begin{document}


\begin{frame}[fragile]
\frametitle{Your title}

\begin{lstlisting}options(digits=3)
set.seed(1)

require(Hmisc)

age - rnorm(1000,50,10)
sex - sample(c('female','male'),1000,TRUE)
out - histbackback(split(age, sex), probability=TRUE, xlim=c(-.06,.06),
main = 'Back to Back Histogram')

#! just adding color
barplot(-out$left, col=red , horiz=TRUE, space=0, add=TRUE, axes=FALSE)
barplot(out$right, col=blue, horiz=TRUE, space=0, add=TRUE, axes=FALSE)
\end{lstlisting}
\end{frame}

\begin{Scode}{echo=FALSE}
  unloadNamespace(SweaveListingUtils)
\end{Scode}

\end{document}

% end example

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] .RData signature

2009-07-13 Thread Duncan Murdoch

On 7/13/2009 3:38 PM, sten...@go.com wrote:

   Is there a way to identify the beginning and end of an .RData file?
   I am trying to restore lost files on opensolaris where the table of contents
   is overwritten and will be reading the raw disk byte by byte.
   I looked at several files in hex and the end seems different, while most of
   them, but not all begin with RDX2.X


No, there is nothing special written at the end of a file.  Many .RData 
files are compressed (by internal gzip), so the RDX2 header is hidden, 
and you'll see bytes 1F 8B instead.  (I don't think we write the trailer 
bytes on the gzip stream, but I'm not sure about that.)


Duncan Murdoch

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] \dQuote in packages

2009-07-13 Thread Duncan Murdoch

On 7/13/2009 3:27 PM, Rebecca Sela wrote:

Thank you!  (That was easy to fix.)

How does one deal with quoting (in \reference)?  The following line causes 
problems:
\references{Sela, Rebecca J., and Simonoff, Jeffrey S., \dQuote{RE-EM Trees: A 
New Data Mining Approach for Longitudinal Data}.}
The error given is:
Warning in parse_Rd(./man/predict.Rd, encoding = unknown) :
  ./man/predict.Rd:28: unknown macro '\dquote'
*** error on file ./man/predict.Rd
Error : ./man/predict.Rd:28: Unrecognized macro \dquote

The manual for writing R packages said I should not just use the character .  
What should I be using here?


Are you sure you showed us the right line?  The error message says 
\dquote, but the line contains \dQuote.  \dquote is wrong, \dQuote 
is fine.


Duncan Murdoch



Thanks again!

Rebecca



- Original Message -
From: Uwe Ligges lig...@statistik.tu-dortmund.de
To: Rebecca Sela rs...@stern.nyu.edu
Cc: r-help r-help@r-project.org
Sent: Friday, July 10, 2009 8:17:44 PM GMT -05:00 US/Canada Eastern
Subject: Re: [R] \dQuote in packages



Rebecca Sela wrote:

Here is one Rd file with problems, now inline so that it can be read:

\name{simpleREEMdata}
\docType{data}
\alias{simpleREEMdata}
\title{Sample Data for RE-EM trees}
\description{
This data set is consists of a panel of 50 individuals with 12 observations per 
individual.  The data is based on a regression tree with an initial split based 
on a dummy variable (\code{D}) and a second split based on time in the branch 
where \code{D=1}.  The observations include both randomly generated 
individual-specific effects and observation-specific errors.
}
\format{
The data has 600 rows and 5 columns.  The columns are: 


insert here:

\itemize{


\item{\code{Y}}{the target variable}
\item{\code{t}}{a numeric predictor (time)}
\item{\code{D}}{a catergorical predictor with two levels, 0 and 1}
\item{\code{ID}}{the identifier for each individual}
\item{\code{X}}{another covariate (which is intentionally unrelated to the 
target variable)}


insert here:

}


or in other words, you need an itemize environment in order to use \item 
within \format, see the manual Writing R Extensions.


Best,
Uwe



}
\references{Sela, Rebecca J., and Simonoff, Jeffrey S., \dQuote{RE-EM Trees: A 
New Data Mining Approach for Longitudinal Data}.}
\keyword{datasets}

Thanks again for your help!

Rebecca

- Original Message -
From: Uwe Ligges lig...@statistik.tu-dortmund.de
To: Rebecca Sela rs...@stern.nyu.edu
Cc: r-help r-help@r-project.org
Sent: Thursday, July 9, 2009 6:05:46 AM GMT -05:00 US/Canada Eastern
Subject: Re: [R] \dQuote in packages

Rebecca,

the attachments have been stripped off by the mailing list.


Rebecca Sela wrote:

That's good to know.  I have attached three Rd files that gave errors (others 
gave identical errors).  I would love to know what is wrong with them.

I'm using 2.1.1 because that is what is installed on the Linux computer I have 
access to.  (I haven't bothered figuring out how to assemble a package in 
Windows.)


You should *really* upgrade! That version is outdated for several years now.

How to do it on Windows: See the R Installation and Administration 
manual with its corresponding section.


Best,
Uwe



Thank you for your help!

Rebecca


- Original Message -
From: Uwe Ligges lig...@statistik.tu-dortmund.de
To: Rebecca Sela rs...@stern.nyu.edu
Cc: r-help r-help@r-project.org
Sent: Wednesday, July 8, 2009 6:11:33 PM GMT -05:00 US/Canada Eastern
Subject: Re: [R] \dQuote in packages

The difference you are experiencing is the new Rd2 parser that is more 
picky now (but also prevents to produce wrong documentation files).


If you make the code of the Rd available, someone might be able to help.

Are you really under R-2.1.1 ??? That is really ancient!


Best,
Uwe Ligges



Rebecca Sela wrote:

I am in the process of submitting a package to CRAN.  R CMD check ran 
successfully on the package on my local computer, using R version 2.1.1.  
However, on the computers for CRAN (with version 2.10.0), the following errors 
occurred:

Warning in parse_Rd(./man/predict.Rd, encoding = unknown) :
  ./man/predict.Rd:28: unknown macro '\dquote'
*** error on file ./man/predict.Rd
Error : ./man/predict.Rd:28: Unrecognized macro \dquote
Warning in parse_Rd(./man/print.Rd, encoding = unknown) :
  ./man/print.Rd:17: unexpected UNKNOWN '\sideeffects'
Warning in parse_Rd(./man/simpleREEMdata.Rd, encoding = unknown) :
  ./man/simpleREEMdata.Rd:10: unknown macro '\item'

Are \dquote, \sideeffects, and \item not supported in newer versions of R?  Is 
there some underlying problem that I should fix that makes these show up?

Thank you very much.

Rebecca

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] object .trPaths not found

2009-07-13 Thread Knut Krueger

Thomas Petzoldt schrieb:


Hi,

using a variable named .trPaths is a feature and a frequent nuisance 
of recent Tinn-R and also explained in the Tinn-R docs.


Solution 1:

In Tinn-R select the menu:

 R -- Configure -- and then Temporarily or Permanent

Solution 2:

Manually edit file R-installation-directory/etc/Rconfig.site

The minimal line to add is:

.trPaths - paste(paste(Sys.getenv('APPDATA'), '\\Tinn-R\\tmp\\', 
sep=''), c('', 'search.txt', 'objects.txt', 'file.r', 'selection.r', 
'block.r', 'lines.r'), sep='')



Hope it helps


no for me - I do not switched to permanent and i did not have the 
Rconfig.site file edited but it is working

So i do not know *why it is working* ;-)

I hope yes for the others (I will forward it) on the computers where it 
is not working


Thank you Knut

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] survSplit with data.frame containing a Surv object

2009-07-13 Thread Heinz Tuechler

At 20:18 13.07.2009, Charles C. Berry wrote:

On Mon, 13 Jul 2009, Heinz Tuechler wrote:


Dear All,

since years I am struggling with Surv objects in data.frames. The 
following seems to have to do with it.
See below the modified example from the help page of survSplit. The 
original works, as expected. If, however, a Surv object is added to 
the data.frame, each record gets doubled.

Is there some solution other than avoiding Surv objects in data.frames?


I think you can modify survSplit so that it will properly handle Surv objects.

Change this line:

newdata - lapply(data, rep, ntimes + 1)

to this:

newdata - lapply(data,
function(x) {
x - as.matrix(x)
x[rep(1:nrow(x), ntimes + 1),]
})

or something similar that results Surv objects being rep()'ed 
rowwise rather than elementwise and returned as objects of the right 
dimension (rather than as a vector).


Caveat: This works in the example you give, but I've not tested this 
extensively.


HTH,

Chuck





Thanks,
Heinz


require(survival)

## from the help page
aml3-survSplit(aml,cut=c(5,10,50),end=time,start=start,
  event=status,episode=i)

summary(aml)
summary(aml3)

coxph(Surv(time,status)~x,data=aml)
## the same
coxph(Surv(start,time,status)~x,data=aml3)

## added to show doubling of records
aml.so - aml
aml.so$surv.object - with(aml, Surv(time, status))

aml3.so - survSplit(aml.so ,cut=c(5,10,50),end=time,start=start,
event=status,episode=i)
summary(aml3.so)

sessionInfo('survival')
R version 2.9.1 Patched (2009-07-07 r48910)
i386-pc-mingw32

locale:
LC_COLLATE=German_Switzerland.1252;LC_CTYPE=German_Switzerland.1252;LC_MONETARY=German_Switzerland.1252;LC_NUMERIC=C;LC_TIME=German_Switzerland.1252

attached base packages:
character(0)

other attached packages:
[1] survival_2.35-4

loaded via a namespace (and not attached):
[1] base_2.9.1  graphics_2.9.1  grDevices_2.9.1 methods_2.9.1
[5] splines_2.9.1   stats_2.9.1 utils_2.9.1

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Charles C. Berry(858) 534-2098
Dept of 
Family/Preventive Medicine

E mailto:cbe...@tajo.ucsd.edu   UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 92093-0901


Thank you Chuck,

it seems to work also with my real data, but I noted that in the 
example aml$x, which is a factor, gets converted to character in 
aml3.so. Maybe, if I find the time, I should look at 
as.data.frame.matrix and rbind for Surv objects.


Thanks again,
Heinz

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help get this simple function to work...

2009-07-13 Thread jim holtman
Your basic problem is that you are trying to change the value of a
global object (TotalCover.df) from within a function.  A better
approach is to pass in the object that you are changing and then
return it as the value of the function.  You can then assign it back
to the original object if you want.  I think something like this
should work, but the code you have is not executable since some
objects are not defined (e.g., data.df):

#Define Variables
Quadrats.df-unique(data.df$Quadrat)
TotalCover.df-cbind(0:750/10,0,0,0,0,0,0)
   colnames(TotalCover.df)-
c(Station,Shrub,Tree,Invasive,Herb,Litter,Bare)
Shrub.df-data.df[data.df$Layer==Shrub,]
Tree.df-data.df[data.df$Layer==Tree,]

Cover.fn-function(ID, TC.df){

   ShrubCover.df-Shrub.df[Shrub.df$Quadrat==ID,]
   for (j in 1:length(ShrubCover.df[,Quadrat])){
   for (i in 1:751){
   if(TC.df[i,Station]=ShrubCover.df[j,Start]
 TC.df[i,Station]= ShrubCover.df[j,Stop])
   TC.df[i,Shrub]- 1
   }
   }
   TreeCover.df-Tree.df[Tree.df$Quadrat==ID,]
   for (j in 1:length(TreeCover.df[,Quadrat])){
   for (i in 1:751){
   if(TC.df[i,Station]=TreeCover.df[j,Start]
 TC.df[i,Station]= TreeCover.df[j,Stop])
   TC.df[i,Tree]- 1
   }
   }
TC.df   # return value
}


Cover.fn(VFFF1-7, TotalCover.df)

On Mon, Jul 13, 2009 at 2:06 PM, chipmaneychipma...@hotmail.com wrote:

 I have a function (see below).  This function has one object, ID.  If I run
 the loops by themselves using a character value (ie,VFFF1-7) instead of
 the function object, then the loops work fine.  However, when I try to
 insert the character value via the function call, it doesn't work. I don't
 get an error, but the TotalCover.df dataframe does not update according to
 the loop criteria. Any obvious problems that you can see?

 Cover Function#

 #Define Variables
 Quadrats.df-unique(data.df$Quadrat)
 TotalCover.df-cbind(0:750/10,0,0,0,0,0,0)
    colnames(TotalCover.df)-
 c(Station,Shrub,Tree,Invasive,Herb,Litter,Bare)
 Shrub.df-data.df[data.df$Layer==Shrub,]
 Tree.df-data.df[data.df$Layer==Tree,]

 Cover.fn-function(ID){

    ShrubCover.df-Shrub.df[Shrub.df$Quadrat==ID,]
        for (j in 1:length(ShrubCover.df[,Quadrat])){
            for (i in 1:751){
                if    (TotalCover.df[i,Station]=ShrubCover.df[j,Start]
  TotalCover.df[i,Station]= ShrubCover.df[j,Stop])
                    TotalCover.df[i,Shrub]- 1
            }
        }
    TreeCover.df-Tree.df[Tree.df$Quadrat==ID,]
        for (j in 1:length(TreeCover.df[,Quadrat])){
            for (i in 1:751){
                if    (TotalCover.df[i,Station]=TreeCover.df[j,Start]
  TotalCover.df[i,Station]= TreeCover.df[j,Stop])
                    TotalCover.df[i,Tree]- 1
            }
        }
 }


 Cover.fn(VFFF1-7)
 --
 View this message in context: 
 http://www.nabble.com/Help-get-this-simple-function-to-work...-tp24466533p24466533.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] re gression with replication

2009-07-13 Thread Ben Bolker



AgusSusanto wrote:
 
 Dear all,
 I would like to fit a linear regression with replication (on each year,
 observation is replicated, e.g 4 times). The independent variable ranges
 for instance 1-5 year, so I expect to have a linear fit of 5 points.
 For that purpose I do these (with dummy variables x and y):
 
 x-rep(seq(1:5),4)
 y-rnorm(20)
 linreg-lm(y~x)
 fitted.values(linreg)  # why produce 20 points of estimate?
 predict(linreg)# why produce 20 points of estimate?
 
 Please somebody explain:
 1. why both fitted.values and predict functions produced 20 points of
 estimate, NOT 5 points.
 
   Because R doesn't know your regression is replicated. It just knows you
 gave it
 20 pairs of predictors and responses.
 
Perhaps you want
 
 predict(linreg,newdata=list(x=unique(x)))   ?
 
 2. is lm(y~x) correct to solve this regression case, or there's a
 correct procedure.
 
 

If you expected the samples within year to be independent of each other
(which seems a little dicey) then this would be correct; otherwise the
coefficients
will be correct but the residual df etc. will be wrong.  You could try
something like

ymeans  - tapply(y,x,mean)
xvals - unique(x)
lm(ymeans ~ xvals)

or a more complicated approach (probably unnecessary here):

library(nlme)

lme(y~x,random=~1|year)

[I think]


-- 
View this message in context: 
http://www.nabble.com/regression-with-replication-tp24468326p24470213.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] survSplit with data.frame containing a Surv object

2009-07-13 Thread Charles C. Berry

On Mon, 13 Jul 2009, Heinz Tuechler wrote:


At 20:18 13.07.2009, Charles C. Berry wrote:

On Mon, 13 Jul 2009, Heinz Tuechler wrote:

 Dear All,
 
 since years I am struggling with Surv objects in data.frames. The 
 following seems to have to do with it.
 See below the modified example from the help page of survSplit. The 
 original works, as expected. If, however, a Surv object is added to the 
 data.frame, each record gets doubled.

 Is there some solution other than avoiding Surv objects in data.frames?

I think you can modify survSplit so that it will properly handle Surv 
objects.


Change this line:

 newdata - lapply(data, rep, ntimes + 1)

to this:

 newdata - lapply(data,
 function(x) {
 x - as.matrix(x)
 x[rep(1:nrow(x), ntimes + 1),]
 })

or something similar that results Surv objects being rep()'ed rowwise 
rather than elementwise and returned as objects of the right dimension 
(rather than as a vector).


Caveat: This works in the example you give, but I've not tested this 
extensively.


HTH,

Chuck



 
 Thanks,

 Heinz
 
 
 require(survival)
 
 ## from the help page

 aml3-survSplit(aml,cut=c(5,10,50),end=time,start=start,
event=status,episode=i)
 
 summary(aml)

 summary(aml3)
 
 coxph(Surv(time,status)~x,data=aml)

 ## the same
 coxph(Surv(start,time,status)~x,data=aml3)
 
 ## added to show doubling of records

 aml.so - aml
 aml.so$surv.object - with(aml, Surv(time, status))
 
 aml3.so - survSplit(aml.so ,cut=c(5,10,50),end=time,start=start,

  event=status,episode=i)
 summary(aml3.so)
 
 sessionInfo('survival')

 R version 2.9.1 Patched (2009-07-07 r48910)
 i386-pc-mingw32
 
 locale:

 
LC_COLLATE=German_Switzerland.1252;LC_CTYPE=German_Switzerland.1252;LC_MONETARY=German_Switzerland.1252;LC_NUMERIC=C;LC_TIME=German_Switzerland.1252
 
 attached base packages:

 character(0)
 
 other attached packages:

 [1] survival_2.35-4
 
 loaded via a namespace (and not attached):

 [1] base_2.9.1  graphics_2.9.1  grDevices_2.9.1 methods_2.9.1
 [5] splines_2.9.1   stats_2.9.1 utils_2.9.1
 
 __

 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
 http://www.R-project.org/posting-guide.html

 and provide commented, minimal, self-contained, reproducible code.

Charles C. Berry(858) 534-2098
 Dept of 
Family/Preventive Medicine

E mailto:cbe...@tajo.ucsd.edu   UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 92093-0901


Thank you Chuck,

it seems to work also with my real data, but I noted that in the example 
aml$x, which is a factor, gets converted to character in aml3.so. Maybe, if I 
find the time, I should look at as.data.frame.matrix and rbind for Surv 
objects.





Heinz,

Try

newdata - lapply(data, function(x) {
if (!is.matrix(x)) {
rep(x,ntimes +1 )
} else {
x - as.matrix(x)
x[rep(1:nrow(x), ntimes + 1),]}})


HTH,

Chuck


Thanks again,
Heinz

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Charles C. Berry(858) 534-2098
Dept of Family/Preventive Medicine
E mailto:cbe...@tajo.ucsd.edu   UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] survSplit with data.frame containing a Surv object

2009-07-13 Thread Charles C. Berry

On Mon, 13 Jul 2009, Charles C. Berry wrote:


On Mon, 13 Jul 2009, Heinz Tuechler wrote:


 At 20:18 13.07.2009, Charles C. Berry wrote:
  On Mon, 13 Jul 2009, Heinz Tuechler wrote:
 
   Dear All,
  
   since years I am struggling with Surv objects in data.frames. The 
   following seems to have to do with it.
   See below the modified example from the help page of survSplit. The 
   original works, as expected. If, however, a Surv object is added to 
   the data.frame, each record gets doubled.
   Is there some solution other than avoiding Surv objects in 
   data.frames?
 
  I think you can modify survSplit so that it will properly handle Surv 
  objects.
 
  Change this line:
 
   newdata - lapply(data, rep, ntimes + 1)
 
  to this:
 
   newdata - lapply(data,

   function(x) {
   x - as.matrix(x)
   x[rep(1:nrow(x), ntimes + 1),]
   })
 
  or something similar that results Surv objects being rep()'ed rowwise 
  rather than elementwise and returned as objects of the right dimension 
  (rather than as a vector).
 
  Caveat: This works in the example you give, but I've not tested this 
  extensively.
 
  HTH,
 
  Chuck
 
 
 
  
   Thanks,

   Heinz
  
  
   require(survival)
  
   ## from the help page

   aml3-survSplit(aml,cut=c(5,10,50),end=time,start=start,
  event=status,episode=i)
  
   summary(aml)

   summary(aml3)
  
   coxph(Surv(time,status)~x,data=aml)

   ## the same
   coxph(Surv(start,time,status)~x,data=aml3)
  
   ## added to show doubling of records

   aml.so - aml
   aml.so$surv.object - with(aml, Surv(time, status))
  
   aml3.so - survSplit(aml.so ,cut=c(5,10,50),end=time,start=start,

event=status,episode=i)
   summary(aml3.so)
  
   sessionInfo('survival')

   R version 2.9.1 Patched (2009-07-07 r48910)
   i386-pc-mingw32
  
   locale:

   
LC_COLLATE=German_Switzerland.1252;LC_CTYPE=German_Switzerland.1252;LC_MONETARY=German_Switzerland.1252;LC_NUMERIC=C;LC_TIME=German_Switzerland.1252
  
   attached base packages:

   character(0)
  
   other attached packages:

   [1] survival_2.35-4
  
   loaded via a namespace (and not attached):

   [1] base_2.9.1  graphics_2.9.1  grDevices_2.9.1 methods_2.9.1
   [5] splines_2.9.1   stats_2.9.1 utils_2.9.1
  
   __

   R-help@r-project.org mailing list
   https://stat.ethz.ch/mailman/listinfo/r-help
   PLEASE do read the posting guide 
   http://www.R-project.org/posting-guide.html

   and provide commented, minimal, self-contained, reproducible code.
 
  Charles C. Berry(858) 534-2098
   Dept of Family/Preventive 
  Medicine

  E mailto:cbe...@tajo.ucsd.edu   UC San Diego
  http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 
  92093-0901


 Thank you Chuck,

 it seems to work also with my real data, but I noted that in the example
 aml$x, which is a factor, gets converted to character in aml3.so. Maybe,
 if I find the time, I should look at as.data.frame.matrix and rbind for
 Surv objects.




Heinz,

Try

newdata - lapply(data, function(x) {
if (!is.matrix(x)) {
rep(x,ntimes +1 )
 } else {


Oops. The next line is unneeded:


 x - as.matrix(x)



Chuck



 x[rep(1:nrow(x), ntimes + 1),]}})


HTH,

Chuck


 Thanks again,
 Heinz

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



Charles C. Berry(858) 534-2098
   Dept of Family/Preventive 
Medicine

E mailto:cbe...@tajo.ucsd.edu   UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Charles C. Berry(858) 534-2098
Dept of Family/Preventive Medicine
E mailto:cbe...@tajo.ucsd.edu   UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] nls, reach limit bounds

2009-07-13 Thread Ravi Varadhan
Hi Uyen, 

You do not have enough information to estimate 4 parameters in your
nonlinear model.  Even though you have 12 data points, only 6 are
contributing independent information (you essentially have 6 replicate
points).  If you plot the fittted function you will see that it fits your
data really well.  However, you will also see that the fitted curve is far
from reaching the asymptote.  An implication of this is that you cannot
reliably estimate b0 and b1 separately.  So, you need more data, especially
for larger x values in the asymptotic region.

Ravi.


---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: rvarad...@jhmi.edu

Webpage:
http://www.jhsph.edu/agingandhealth/People/Faculty_personal_pages/Varadhan.h
tml

 





-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of spencerg
Sent: Saturday, July 11, 2009 9:50 PM
To: UyenThao Nguyen
Cc: r-help@r-project.org
Subject: Re: [R] nls, reach limit bounds

  Have you plotted the data?  There are two standard sources of
non-convergence problems like this:  First, there may not be enough
information in your data to estimate all four parameters.  Second, if that's
not the case, then  your starting values are not appropriate. 


  I routinely use optim or nlminb to find a sensible solution,
because these general purpose optimizers are more likely to converge than
nls.  To do this, I write a function with a name like SSElogistic to
compute the sum of squares of residuals for your model.  I like to use
optim with hessian=TRUE.  Then I compute eigen(fit$hessian,
symmetric=TRUE), with fit = the object returned by optim.  If the
smallest eigenvalue is negative, it says that optim found a saddle point.
If the smallest eigenvalue is less than, e.g.,
1e-8 times the largest, it says that the smallest eigenvector is very poorly
estimated.  Round those numbers off grossly to 1 significant digit, and they
will likely suggest which parameter you can fix and drop from the model. 


  Hope this helps. 
  Spencer Graves


UyenThao Nguyen wrote:
 Hi,

 I am trying to fit a 4p logistic to this data, using nls function. The
function didn't freely converge; however, it converged if I put a lower and
an upper bound (in algorithm port). Also, the b1.A parameter always takes
value of the upper bound, which is very strange. Has anyone experienced
about non-convergent of nls and how to deal with this kind of problem?

 Thank you very much.



 3
y   x
 1  0.8924619 -0.31875876
 2  1.1814749 -0.21467016
 3  1.6148266  0.06069784
 4  2.2091363  0.54032947
 5  2.7019079  1.04921802
 6  3.0679585  1.60745502
 9  0.9436973 -0.31875876
 10 1.2201133 -0.21467016
 11 1.6470043  0.06069784
 12 2.2090048  0.54032947
 13 2.6864857  1.04921802
 14 3.0673523  1.60745502

 new.cont=nls.control(maxiter = 1, tol = 1e-05, minFactor = 1e-08,
 printEval = FALSE, warnOnly = FALSE)


 b0.A=.9*min(DAT$y)
 b1.A=1.1*max(DAT$y)-b0.A
 b2.A=-1*mean(DAT$x)
 b3.A=1


 b0.A
 b1.A
 b2.A
 b3.A

 nls.mdl.A=nls(y~b0 + b1/(1+exp(-b2-b3*x)),data=DAT,start = 
 list(b0=b0.A, b1=b1.A, b2=b2.A, b3=b3.A), lower=-10, upper=10, 
 algorithm=port,trace=T,control=new.cont)

 ##



   [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to combine two rows (in a dataframe) into a third row?

2009-07-13 Thread Mark Na
Hi Henrique  other R-helpers,
Thank you for helping me last week. I used Henrique's suggestion to develop
some code (below) to combine two rows in my dataframe into a third row, and
then delete the original two rows. It works well.

My solution is not very elegant however; if there's a function (or a better
way) to accomplish this in 1-2 lines (rather than my 6) I'd appreciate
knowing about it.

Many thanks, Mark Na


#make some data for this example

data-data.frame(c(1A,1B),c(10,15))

names(data)-c(id,value)

data$value-as.numeric(as.character(data$value))


#combine two lines into one by summing their values in the value column

fixed-data.frame() #create empty data frame to hold fixed rows

fixed-rbind(fixed, aggregate(data[value],list(substr(data[,id],1,1)),
sum))

#copy previous line as necessary for other fixes

names(fixed)-c(id,value) #fix column names


#bind the fixed line to the main dataframe and delete the original lines

data-rbind(data,fixed) #add fixed lines to data

data-data[-which(c(1A,1B) %in% data$id),] #delete lines from data

rownames(data) - 1:nrow(data) #renumber rows



On Thu, Jul 9, 2009 at 5:58 PM, Henrique Dallazuanna www...@gmail.comwrote:

 Try this:

 aggregate(x[VALUE], list(substr(x[,ID], 1, 1)), sum)

 On Thu, Jul 9, 2009 at 7:27 PM, Mark Na mtb...@gmail.com wrote:

 Dear R-helpers,

 I have two rows in my dataframe:

 IDVALUE
 1A10
 1B15

 and I would like to combine these two rows into a single (new) row in my
 dataframe:

 IDVALUE
 125

 ...simply by specifying a new value for ID and summing the two VALUES.

 I have been trying to do this with with rbind, but it's not working.

 I'd appreciate any pointers.

 Thanks, Mark Na

[[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




 --
 Henrique Dallazuanna
 Curitiba-Paraná-Brasil
 25° 25' 40 S 49° 16' 22 O


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Fortran function for quantiles

2009-07-13 Thread DHIMAN BHADRA
Hi,
I was wondering whether there is any Fortran function or associated library
for evaluating the quantiles of a set of values (something which the
R-function quantile() does). Any help will be much appreciated.
Thanks and regards,
Dhiman Bhadra

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Heckman Selection MOdel Help in R

2009-07-13 Thread Arne Henningsen
On Mon, Jul 13, 2009 at 4:26 PM, saurav pathakpathak.sau...@gmail.com wrote:
 I am using R 2.9.1,

That's good!

 I am not sure about the version of sampleSelection and maxLik

This is important! Please check the version numbers, e.g. with
R help(package=maxLik)
R help(package=sampleSelection)

BTW: Did you install the development version of the maxLik package
from R-Forge? If yes, please use the stable version that is available
on CRAN.

 Let em explain, In my data the DV used as 's' in the formula has some
 missing values, which can lead to bias in our case, so I used

 adpopdata$s - ifelse(is.na(Ln-oy5_1),0,1)

 ie I convert all the missing values for the variable ln_oy5_1 to 0 and all
 non-missings as 1, so is the source of missing values from the IVs used in
 the following:

 myProbit- glm(s ~ age + gender + gemedu + gemhinc + es_gdppc +
 +     imf_pop + estbbo_m, family = binomial(link = probit))

 so I dont know where the missing values are coming from, can you suggest how
 to correct it??

They might come from the explanatory variables. Please check, e.g. with
R sum( is.na( adpopdata$s ) )
R sum( is.na( adpopdata$age ) )
R sum( is.na( adpopdata$gender ) )
...
R sum( is.na( adpopdata$estbbo_m ) )

Please reply to all (i.e. including R-help) so that others who will
have similar questions and problems in the future could benefit from
our discussion.

Arne

 On Mon, Jul 13, 2009 at 3:09 PM, Arne Henningsen
 arne.henning...@googlemail.com wrote:

 On Mon, Jul 13, 2009 at 11:18 AM, Pathak,
 Sauravs.patha...@imperial.ac.uk wrote:
  Dear Arne
  I have gone through the paper and I have tried it at my end, I would
  really appreciate if you could address the following:
 
  1. Based upon your suggestion I used the following:
 
  regmod2 - selection(s ~ age + gender + gemedu + gemhinc + es_gdppc +
     imf_pop + estbbo_m, ln_oy5_1 ~ age+ gender+fearfail+gemedu,
  adpopdata, method = 2step)
  On trying the above( notice that I have changed heckit to selection
  in the above command, i get the following error message
 
  Error in coef.probit(result$probit) :
   could not find function coef.maxLik

 That's weird. Which versions of R, sampleSelection, and maxLik do you use?

  Before trying the above I tried the following:
 
  2. When I tried to do the Heckman selection model in stages , the first
  run was successful, I mean, using the following:
 
  myProbit- glm(s ~ age + gender + gemedu + gemhinc + es_gdppc +
  +     imf_pop + estbbo_m, family = binomial(link = probit))
  summary(myProbit)
 
  I am successful upto this point, but
 
  3. When I try calculating the IMR using the following:
  adpopdata$IMR-invMillsRatio(myProbit)$IMR1
 
  I get the error below
  Error in `$-.data.frame`(`*tmp*`, IMR, value = c(2.50039945424535,  :
   replacement has 257358 rows, data has 343251

 I guess that you have some NAs in the data so that you have the IMRs
 not for all observations but only for the observations witout NAs.

 R myIMRs - invMillsRatio(myProbit)$IMR1
 should work.

  Is there a code to calculate IMR by hand??

 Yes, inside invMillsRatio()
 However, why do you want to do this?

  what I see is that the number of rows of IMR calculated and the number
  of rows in the actual data set do not match (may be some missing
  value issues, I am not sure, if it is, how to fix it?) and hence IMR
  could
  not be added to my original data set, how do I fix this and then proceed
  to get correct IMR to use in my outcome equation  (the OLS stage)
 
  This is really taking a lot of time, I am working on it for weeks, can
  you please help me kindly, If you wish I can send you the data set as
  well

 Please try to fix it yourself.

 Arne

 
  -Original Message-
  From: Arne Henningsen [mailto:arne.henning...@googlemail.com]
  Sent: 13 July 2009 00:56
  To: Pathak, Saurav; r-help@r-project.org; otoo...@ut.ee
  Subject: Re: Heckman Selection MOdel Help in R
 
  Hi Saurav!
 
  On Sun, Jul 12, 2009 at 6:06 PM, Pathak,
  Sauravs.patha...@imperial.ac.uk wrote:
  I am new to R, I have to do a 2 step Heckman model, my selection
  equation is
  below which I was successful in running but I am unable to proceed
  further,
 
 
 
  I have so far used the following command
 
  glm(formula = s ~ age + gender + gemedu + gemhinc + es_gdppc +
      imf_pop + estbbo_m, family = binomial(link = probit))
 
  My question is
  1. How do i discard the non significant selection variables (one out of
  the
  seven variables above is non-significant) and calculate the Inverse
  Mills
  Ratio of the significant variables
 
  2. I need the inverse mills ratio from the above to run the outcome
  equation
  model using OLS with the Inverse mills ratio calculated on the basis of
  the
  above probit as the control in my outcome equation,  hence I need to
  get the
  IMR (Is there another direct way?)
 
  3. How can this be done in R using my concept or otherwise does there
  exist
  another way of doing what I wish to achieve
 
 
 
  On trying
 

[R] Rgraphviz ignoring outputorder attribute

2009-07-13 Thread Murat Tasan
Hi - since Rgraphviz was officially moved over to Bioconductor, this might
be a misguided post, but it's worth a shot:
I'm plotting a graph using Rgraphviz and in an attempt to force the edges to
be behind the nodes (on a fairly complex graph), I set the outputorder
graph attribute to edgesfirst (as is described in the graphviz docs here
- http://www.graphviz.org/doc/info/attrs.html).

This field does also appear (although with a short and cryptic description)
in the Rgraphviz package documentation.

However I still will get some stray edges that traverse over the nodes.
 (And yes, I've set the fillcolor to a non-transparent color.)  (Also,
yes, I'm setting the rest of the graph/edge/node properties correctly as all
of my other attribute settings are being honored by Rgraphviz.)

The problem occurs with multiple devices including pdf(), png(), and X11().

Anyone else ever have edges on top of nodes and find outputorder ignores
your request?

-Murat

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to run a R program with input arguments

2009-07-13 Thread cls59


edisonying wrote:
 
 I am a beginner in R and know only a little about it yet. I have a script
 written in R language, named as a.txt for example. I am using a Linux
 machine, at present I only know that I can type R in the terminal and
 then copy-paste the content in a.txt to the R's interface to execute the
 program. However, I want to know if there is any method that allows me to
 add some input arguments directly after the R in the terminal.
 Specifically, I want to type the following in the cmd line:
 
R (some flags or option) a.txt
 
 then the program will begin to run. Besides, if the program will read a
 data file first, can I also specify the data file in the command line?
 Then the complete command will become:
 
R  a.txt  data.txt
 
 This is important for a beginner. Thanks very much!:working:

 

We usually have R execute scripts in the following manner:

R --vanilla --slave --args [your args here]  scriptFile.R

Inside scriptFile.R the arguments can be retrieved by using
commandArgs(trailingOnly = T). The trailingOnly flag causes only the
arguments following --args to be be returned and not --args, --vanilla (no
save, no restore, quick startup) and --slave (makes R run quiet). 

Personally, I get tired of typing R --vanilla --slave --args all the time
and prefer to use Rscript. Since you are using Linux you can do the same by
putting the following hashbang at the top of your file:

#!/usr/bin/env Rscript

Then your script can be run using:

./scriptFile.R [your args here]

The arguments are still accessed inside the script using commandArgs(T)

Good luck!

-Charlie


-
Charlie Sharpsteen
Undergraduate
Environmental Resources Engineering
Humboldt State University
-- 
View this message in context: 
http://www.nabble.com/how-to-run-a-R-program-with-input-arguments-tp24465852p24471559.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] SweaveListingUtils question

2009-07-13 Thread Karsten Weinert
Hello group,
recently I read about the SweaveListingUtils package and now I would like to
try it out. However I can not make it run...

Below is a minimal example. The problem seems the following line, generated
by the package:

\ifthenelse{\boolean{swe...@gin}}{\setkeys{gin}{width=0.6\textwidth}}{}%

If I comment it out, it works. What can I do about it?

I am not sure if this is the right mailing list for this question since it
touches both R and LaTeX. Sorry if I am wrong...

Any hint appreciated,
kind regards,

Karsten.

%% begin example
\documentclass[9pt]{beamer}

\usepackage{fancyvrb}
\usepackage{listings}

% choose language and char set
\usepackage[ngerman]{babel}
\usepackage[ansinew]{inputenc}

\SweaveOpts{prefix.string=Routput/parcel, keep.source=TRUE}

\begin{Scode}{results=tex, echo=FALSE}
require(SweaveListingUtils)
SweaveListingoptions(intermediate = FALSE)
SweaveListingPreparations()
\end{Scode}

\begin{document}

\begin{frame}[fragile]
\frametitle{Your title}

\begin{lstlisting}options(digits=3)
set.seed(1)

require(Hmisc)

age - rnorm(1000,50,10)
sex - sample(c('female','male'),1000,TRUE)
out - histbackback(split(age, sex), probability=TRUE, xlim=c(-.06,.06),
main = 'Back to Back Histogram')

#! just adding color
barplot(-out$left, col=red , horiz=TRUE, space=0, add=TRUE, axes=FALSE)
barplot(out$right, col=blue, horiz=TRUE, space=0, add=TRUE, axes=FALSE)
\end{lstlisting}
\end{frame}

\begin{Scode}{echo=FALSE}
  unloadNamespace(SweaveListingUtils)
\end{Scode}

\end{document}
%% end example

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Smoothing parameter or Bandwidth (h)

2009-07-13 Thread forestra

Hello to everybody,

I need to know the Smoothing Parameter to obtain Home Range of an animal
through the Area Kernel. I have 200 locations with x and y. How can I obtain
the Smoothing Parameter with R for LSCV, CV and Href method???

Thank you.   
-- 
View this message in context: 
http://www.nabble.com/Smoothing-parameter-or-Bandwidth-%28h%29-tp24470439p24470439.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] problem predict/poly

2009-07-13 Thread Alexander V Stolpovsky
Dear R experts,

I am observing undesired behavior of predict(fit, newdata), in case when fit 
object is produced by lm() involving a poly(). Here is how to reproduce:

x - c(1:10)
y - sin(c(1:10))
fit - lm(formula=y~poly(x, 5, raw=TRUE))
predict(fit, newdata=data.frame(x=c(1:10))) ## this works
predict(fit, newdata=data.frame(x=c(1:1)))  ## this is broken, error below

Error in poly(x, 5, raw = TRUE) :
  'degree' must be less than number of unique points

The problem is in poly():

if (raw) {
if (degree = length(unique(x)))
stop('degree' must be less than number of unique points)

This assertion is only warranted when poly is used to fit a model. But it is 
unnecessary and undesired if poly() is used to obtain prediction. Or am I 
missing something?
Why I would want to obtain model prediction for a single point is another 
story. I am filling a matrix of model values one element at a time, so that the 
matrix can then be given to persp() (I am fitting a two-variable model).

  Alex Stolpovsky



-
This transmission may contain information that is privileged,
confidential, legally privileged, and/or exempt from disclosure
under applicable law.  If you are not the intended recipient, you
are hereby notified that any disclosure, copying, distribution, or
use of the information contained herein (including any reliance
thereon) is STRICTLY PROHIBITED.  Although this transmission and
any attachments are believed to be free of any virus or other
defect that might affect any computer system into which it is
received and opened, it is the responsibility of the recipient to
ensure that it is virus free and no responsibility is accepted by
JPMorgan Chase  Co., its subsidiaries and affiliates, as
applicable, for any loss or damage arising in any way from its use.
 If you received this transmission in error, please immediately
contact the sender and destroy the material in its entirety,
whether in electronic or hard copy format. Thank you.
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to run a R program with input arguments

2009-07-13 Thread edisonying

Thanks for your help. I found another cmd as follows:

   R CMD BATCH  infile  outfile

what's the difference between this expression and your suggestion?

cls59 wrote:
 
 
 edisonying wrote:
 
 I am a beginner in R and know only a little about it yet. I have a script
 written in R language, named as a.txt for example. I am using a Linux
 machine, at present I only know that I can type R in the terminal and
 then copy-paste the content in a.txt to the R's interface to execute
 the program. However, I want to know if there is any method that allows
 me to add some input arguments directly after the R in the terminal.
 Specifically, I want to type the following in the cmd line:
 
R (some flags or option) a.txt
 
 then the program will begin to run. Besides, if the program will read a
 data file first, can I also specify the data file in the command line?
 Then the complete command will become:
 
R  a.txt  data.txt
 
 This is important for a beginner. Thanks very much!:working:

 
 
 We usually have R execute scripts in the following manner:
 
 R --vanilla --slave --args [your args here]  scriptFile.R
 
 Inside scriptFile.R the arguments can be retrieved by using
 commandArgs(trailingOnly = T). The trailingOnly flag causes only the
 arguments following --args to be be returned and not --args, --vanilla (no
 save, no restore, quick startup) and --slave (makes R run quiet). 
 
 Personally, I get tired of typing R --vanilla --slave --args all the time
 and prefer to use Rscript. Since you are using Linux you can do the same
 by putting the following hashbang at the top of your file:
 
 #!/usr/bin/env Rscript
 
 Then your script can be run using:
 
 ./scriptFile.R [your args here]
 
 The arguments are still accessed inside the script using commandArgs(T)
 
 Good luck!
 
 -Charlie
 
 

-- 
View this message in context: 
http://www.nabble.com/how-to-run-a-R-program-with-input-arguments-tp24465852p24472117.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


  1   2   >