Re: [R] Total effect of X on Y under presence of interaction effects

2011-05-12 Thread vioravis
This is what I believe is referred to as suppression in regression, where
the correlation between the independent and the dependent
variable turns out to be of one sign whereas the regression coefficient
turns out to be of the opposite sign.

Read here about suppression:

http://www.uvm.edu/~dhowell/gradstat/psych341/lectures/MultipleRegression/multreg3.html
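
As a hedged illustration (simulated data, not from the linked page), the effect
is easy to reproduce: below, the marginal correlation of x2 with y is positive
while its regression coefficient is negative.

set.seed(1)
n  <- 1000
x1 <- rnorm(n)
x2 <- 0.9 * x1 + rnorm(n, sd = 0.3)   # x2 rides along with x1
y  <- x1 - 0.5 * x2 + rnorm(n)        # y loads positively on x1, negatively on x2
cor(x2, y)                            # positive (around 0.4)
coef(lm(y ~ x1 + x2))["x2"]           # negative, close to -0.5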

HTH



--
View this message in context: 
http://r.789695.n4.nabble.com/Total-effect-of-X-on-Y-under-presence-of-interaction-effects-tp3514137p3516446.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] lm and anova

2011-05-12 Thread Sara Sjöstedt de Luna
Hi!

We have run a linear regression model with 3 explanatory variables and get the 
output below.
Does anyone know what type of test the anova table below performs, and why we get such 
different results in terms of significant variables between the two tables?

Thanks!

/Sara

> summary(model)

Call:
lm(formula = log(HOBU) ~ Vole1 + Volelag + Year)

Residuals:
      Min        1Q    Median        3Q       Max 
-0.757284 -0.166681  0.009478  0.181304  0.692916 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept) 80.041737  12.018726   6.660 1.40e-07 ***
Vole1        0.005521   0.041626   0.133   0.8953    
Volelag      0.033966   0.018392   1.847   0.0738 .  
Year        -0.035927   0.006027  -5.961 1.08e-06 ***

> anova(model)
Analysis of Variance Table

Response: log(HOBU)
          Df Sum Sq Mean Sq F value    Pr(>F)    
Vole1      1 1.7877  1.7877 13.1772 0.0009486 ***
Volelag    1 0.5817  0.5817  4.2878 0.0462831 *  
Year       1 4.8205  4.8205 35.5323 1.082e-06 ***
Residuals 33 4.4769  0.1357


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] How to extract information from the following dataset?

2011-05-12 Thread Xin Zhang
Hi all,

I have never worked with this kind of data before, so Please help me out
with it.
I have the following data set, in a csv file, looks like the following:

Jan 27, 2010  16:01:24,000 125 - - -
Jan 27, 2010  16:06:24,000 125 - - -
Jan 27, 2010  16:11:24,000 176 - - -
Jan 27, 2010  16:16:25,000 159 - - -
Jan 27, 2010  16:21:25,000 142 - - -
Jan 27, 2010  16:26:24,000 142 - - -
Jan 27, 2010  16:31:24,000 125 - - -
Jan 27, 2010  16:36:24,000 125 - - -
Jan 27, 2010  16:41:24,000 125 - - -
Jan 27, 2010  16:46:24,000 125 - - -
Jan 27, 2010  16:51:24,000 125 - - -
Jan 27, 2010  16:56:24,000 125 - - -
Jan 27, 2010  17:01:24,000 157 - - -
Jan 27, 2010  17:06:24,000 172 - - -
Jan 27, 2010  17:11:25,000 142 - - -
Jan 27, 2010  17:16:24,000 125 - - -
Jan 27, 2010  17:21:24,000 125 - - -
Jan 27, 2010  17:26:24,000 125 - - -
Jan 27, 2010  17:31:24,000 125 - - -
Jan 27, 2010  17:36:24,000 125 - - -
Jan 27, 2010  17:41:24,000 125 - - -
Jan 27, 2010  17:46:24,000 125 - - -
Jan 27, 2010  17:51:24,000 125 - - -
..

The first few columns are month, day, year, time with OS3 accuracy. And the
last number is the measurement I need to extract.
I wonder if there is an easy way to just take out the measurements only from
a specific day and hour, i.e. if I want measurements from Jan 27 2010
16:--:--
then I get 125,125,176,159,142,142,125,125,125,125,125,125.
Many thanks!!

-- 
Xin Zhang
Ph.D Candidate
Department of Statistics
University of California, Riverside

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to extract information from the following dataset?

2011-05-12 Thread Jose-Marcio Martins da Cruz

Xin Zhang wrote:

Hi all,

I have never worked with this kind of data before, so Please help me out
with it.
I have the following data set, in a csv file, looks like the following:

Jan 27, 2010  16:01:24,000 125 - - -
Jan 27, 2010  16:06:24,000 125 - - -
Jan 27, 2010  16:11:24,000 176 - - -
Jan 27, 2010  16:16:25,000 159 - - -
Jan 27, 2010  16:21:25,000 142 - - -
Jan 27, 2010  16:26:24,000 142 - - -
Jan 27, 2010  16:31:24,000 125 - - -
Jan 27, 2010  16:36:24,000 125 - - -
Jan 27, 2010  16:41:24,000 125 - - -
Jan 27, 2010  16:46:24,000 125 - - -
Jan 27, 2010  16:51:24,000 125 - - -
Jan 27, 2010  16:56:24,000 125 - - -
Jan 27, 2010  17:01:24,000 157 - - -
Jan 27, 2010  17:06:24,000 172 - - -
Jan 27, 2010  17:11:25,000 142 - - -
Jan 27, 2010  17:16:24,000 125 - - -
Jan 27, 2010  17:21:24,000 125 - - -
Jan 27, 2010  17:26:24,000 125 - - -
Jan 27, 2010  17:31:24,000 125 - - -
Jan 27, 2010  17:36:24,000 125 - - -
Jan 27, 2010  17:41:24,000 125 - - -
Jan 27, 2010  17:46:24,000 125 - - -
Jan 27, 2010  17:51:24,000 125 - - -
..

The first few columns are month, day, year, time with OS3 accuracy. And the
last number is the measurement I need to extract.
I wonder if there is a easy way to just take out the measurements only from
a specific day and hour, i.e. if I want measurements from Jan 27 2010
16:--:--
then I get 125,125,176,159,142,142,125,125,125,125,125,125.
Many thanks!!


The easiest is in the shell, if you're using some flavour of unix :

grep "Jan 27, 2010  16" filein.txt | awk '{print $5}' > fileout.txt

and use fileout which will contain only the column of data you want.





--

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Binomial

2011-05-12 Thread blutack
Hi, I need to create a function which generates a Binomial random number
without using the rbinom function. Do I need to use the choose function or
am I better just using a sample?
Thanks. 

--
View this message in context: 
http://r.789695.n4.nabble.com/Binomial-tp3516778p3516778.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Binomial

2011-05-12 Thread Alexander Engelhardt

Am 12.05.2011 10:46, schrieb blutack:

Hi, I need to create a function which generates a Binomial random number
without using the rbinom function. Do I need to use the choose function or
am I better just using a sample?
Thanks.


I think I remember other software that generates binomial data with e.g. 
pi=0.7 by


pi <- 0.7
x <- runif(100) > pi
summary(x)

 -- Alex

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Snow/Snowfall hangs on windows 7

2011-05-12 Thread Uwe Ligges



On 28.04.2011 09:57, Truc wrote:

Dear Anna !

I have the same problem with Window 7 - 64 bits.
If I use R 2.12.2 with snow packages 0.3-3. It works well. But with R 2.13.0
with the same snow packages.
It just hang. I start R (Run as administrator), turn off firewall ... But it
seems R .13.0 version of socket connect to window has been changed ???
That 's my experience so far.



I was just on a Windows 7 64bit machine and tried to verify some older 
reports. For this one:


the example in ?parApply

library(snow)
cl <- makeSOCKcluster(c("localhost", "localhost"))
parSapply(cl, 1:20, get("+"), 3)

works fine with R-2.13.0 in 32-bit and 64-bit and snow 0.3-3. Since you 
have not given a single line of code, it is hard to help.


Uwe Ligges




--
View this message in context: 
http://r.789695.n4.nabble.com/Snow-Snowfall-hangs-on-windows-7-tp3436724p3480368.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Binomial

2011-05-12 Thread Ted Harding
On 12-May-11 09:02:45, Alexander Engelhardt wrote:
 Am 12.05.2011 10:46, schrieb blutack:
 Hi, I need to create a function which generates a Binomial random
 number
 without using the rbinom function. Do I need to use the choose
 function or
 am I better just using a sample?
 Thanks.
 
 I think I remember other software who generates binomial data
 with e.g. pi=0.7 by
 
  pi <- 0.7
  x <- runif(100) > pi
 summary(x)
 
   -- Alex

That needs to be the other way round (and perhaps also convert
it to 0/1):

  x <- 1*(runif(100) < pi)

since Prob(runif > pi) = (1 - pi).

Comparison:

  pi <- 0.7
  x <- runif(100) > pi
  x[1:10]
  #  [1] FALSE  TRUE  TRUE FALSE FALSE  TRUE  TRUE  TRUE FALSE FALSE
  sum(x)/100
  # [1] 0.36

  x <- 1*(runif(100) < pi)
  x[1:10]
  # [1] 0 0 1 1 1 1 1 0 1 0
  sum(x)/100
  # [1] 0.62

Ted


E-Mail: (Ted Harding) ted.hard...@wlandres.net
Fax-to-email: +44 (0)870 094 0861
Date: 12-May-11   Time: 10:21:26
-- XFMail --

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to extract information from the following dataset?

2011-05-12 Thread Mike Marchywka

 Date: Thu, 12 May 2011 10:43:59 +0200
 From: jose-marcio.mart...@mines-paristech.fr
 To: xzhan...@ucr.edu
 CC: r-help@r-project.org
 Subject: Re: [R] How to extract information from the following dataset?

 Xin Zhang wrote:
  Hi all,
 
  I have never worked with this kind of data before, so Please help me out
  with it.
  I have the following data set, in a csv file, looks like the following:
 
  Jan 27, 2010 16:01:24,000 125 - - -
  Jan 27, 2010 16:06:24,000 125 - - -
  Jan 27, 2010 16:11:24,000 176 - - -
  Jan 27, 2010 16:16:25,000 159 - - -
  Jan 27, 2010 16:21:25,000 142 - - -
  Jan 27, 2010 16:26:24,000 142 - - -
  Jan 27, 2010 16:31:24,000 125 - - -
  Jan 27, 2010 16:36:24,000 125 - - -
  Jan 27, 2010 16:41:24,000 125 - - -
  Jan 27, 2010 16:46:24,000 125 - - -
  Jan 27, 2010 16:51:24,000 125 - - -
  Jan 27, 2010 16:56:24,000 125 - - -
  Jan 27, 2010 17:01:24,000 157 - - -
  Jan 27, 2010 17:06:24,000 172 - - -
  Jan 27, 2010 17:11:25,000 142 - - -
  Jan 27, 2010 17:16:24,000 125 - - -
  Jan 27, 2010 17:21:24,000 125 - - -
  Jan 27, 2010 17:26:24,000 125 - - -
  Jan 27, 2010 17:31:24,000 125 - - -
  Jan 27, 2010 17:36:24,000 125 - - -
  Jan 27, 2010 17:41:24,000 125 - - -
  Jan 27, 2010 17:46:24,000 125 - - -
  Jan 27, 2010 17:51:24,000 125 - - -
  ..
 
  The first few columns are month, day, year, time with OS3 accuracy. And the
  last number is the measurement I need to extract.
  I wonder if there is a easy way to just take out the measurements only from
  a specific day and hour, i.e. if I want measurements from Jan 27 2010
  16:--:--
  then I get 125,125,176,159,142,142,125,125,125,125,125,125.
  Many thanks!!

 The easiest is in the shell, if you're using some flavour of unix :

 grep "Jan 27, 2010 16" filein.txt | awk '{print $5}' > fileout.txt

 and use fileout which will contain only the column of data you want.

Normally that is what I do, but the R POSIXct features work pretty easily.
I guess I'd use bash text processing commands to put the data into a 
form you like, perhaps y-mo-day time, and then read it in as a data frame.
Usually I convert everything to time since the epoch because I like integers,
but there are some facilities here, like round(), that work well with date-times.

> dx <- as.POSIXct("2011-04-03 13:14:15")
> dx
[1] "2011-04-03 13:14:15 CDT"
> round(dx, "hour")
[1] "2011-04-03 13:00:00 CDT"
> as.integer(dx)
[1] 1301854455
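
For an R-only route, a hedged sketch (assuming the file is whitespace-delimited
exactly as shown; "meas.txt" is a hypothetical file name, and %b needs an
English locale to match "Jan"):

dat <- read.table("meas.txt", stringsAsFactors = FALSE)
# V1..V4 = month, "day,", year, time; V5 = the measurement
tm  <- as.POSIXct(paste(dat$V1, dat$V2, dat$V3, dat$V4),
                  format = "%b %d, %Y %H:%M:%S")
dat$V5[format(tm, "%Y-%m-%d %H") == "2010-01-27 16"]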


  
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] R won't start keeps crashing

2011-05-12 Thread Bazman76
Hi there,

I am a relatively new user; I only downloaded it about a week ago.

I was getting along fine, but last night I tried to select "save workspace".

Since then R will not work, and I really really need it.

There are two error messages:

The first, in a pop-up box, is "Fatal error: unable to resolve data in
.RData".

The second is a message in the GUI: "Error in loadNamespace(name): there is
no package called 'vars'".

When I click to dismiss the message box the whole thing just shuts down!!

I have tried reinstalling but it has made no difference.

Please Help





--
View this message in context: 
http://r.789695.n4.nabble.com/R-won-t-start-keeps-crashing-tp3516829p3516829.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] lm and anova

2011-05-12 Thread Sara Sjöstedt de Luna
Hi!

We have run a linear regression model with 3 explanatory variables and get the 
output below.
Does anyone know what type of test the anova table below performs, and why we get such 
different results in terms of significant variables between the two tables?

Thanks!

/Sara

> summary(model)

Call:
lm(formula = log(HOBU) ~ Vole1 + Volelag + Year)

Residuals:
      Min        1Q    Median        3Q       Max 
-0.757284 -0.166681  0.009478  0.181304  0.692916 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept) 80.041737  12.018726   6.660 1.40e-07 ***
Vole1        0.005521   0.041626   0.133   0.8953    
Volelag      0.033966   0.018392   1.847   0.0738 .  
Year        -0.035927   0.006027  -5.961 1.08e-06 ***

> anova(model)
Analysis of Variance Table

Response: log(HOBU)
          Df Sum Sq Mean Sq F value    Pr(>F)    
Vole1      1 1.7877  1.7877 13.1772 0.0009486 ***
Volelag    1 0.5817  0.5817  4.2878 0.0462831 *  
Year       1 4.8205  4.8205 35.5323 1.082e-06 ***
Residuals 33 4.4769  0.1357


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to extract information from the following dataset?

2011-05-12 Thread hwright

I have the following data set, in a csv file, looks like the following:

Jan 27, 2010  16:01:24,000 125 - - -
Jan 27, 2010  16:06:24,000 125 - - -
..
The first few columns are month, day, year, time with OS3 accuracy. And the
last number is the measurement I need to extract.
I wonder if there is a easy way to just take out the measurements only from
a specific day and hour
-- 
Xin Zhang
Ph.D Candidate
Department of Statistics
University of California, Riverside
---

I use strptime to configure the date format in my time series dataset.
First check to see how the dates are read.
For example:
# check the structure
str(your_file)
'data.frame' ...etc
This tells me that my original date is a factor but not in POSIXlt format.

#check your column dates
head(your_file)
[1] 1984-01-26 1984-02-09 1984-03-01 1984-03-15 1984-03-29
1984-04-12
These are discrete column dates.

#convert your date format
your_file$date <- strptime(your_file$date, "%m/%d/%Y")
call ?strptime for options

Example:
For a specific day or hour, strptime would utilize:
strptime(your_file$date, "%d/%I") for day and hour.

Once you extract the type of date format you want, run str(your_file) again
to confirm the format change.
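
A hedged aside: once the column has been parsed, format() is the usual tool for
pulling pieces such as day and hour back out (strptime() goes the other way,
from text to date-times). For example, assuming your_file$date has already been
converted:

format(your_file$date, "%d")   # day of month for each row
format(your_file$date, "%H")   # hour (00-23) for each row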
Does this answer your question?
Best,



-
---
Heather A. Wright, PhD candidate
Ecology and Evolution of Plankton
Stazione Zoologica Anton Dohrn
Villa Comunale
80121 - Napoli, Italy
--
View this message in context: 
http://r.789695.n4.nabble.com/How-to-extract-information-from-the-following-dataset-tp3516752p3516952.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Total effect of X on Y under presence of interaction effects

2011-05-12 Thread Frank Harrell
I second David's first reply regarding the non-utility of individual
coefficients, especially for low-order terms.  Also, nonlinearity can be
quite important.  Properly modeling main effects through the use of flexible
nonlinear functions can sometimes do away with the need for interaction
terms.

Back to the original question, it is easy to get total effects for each
predictor.  The anova function in the rms package does this, by combining
lower and higher-order effects (main effects + interactions).
Frank
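
For concreteness, a hedged sketch of what that looks like (hypothetical data
and formula, not from Frank's post):

library(rms)
f <- ols(y ~ a * b * c * d, data = mydata)   # mydata, y, a, b, c, d are hypothetical
anova(f)  # rows such as "a (Factor+Higher Order Factors)" pool a's main effect
          # with every interaction involving a into one chunk test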

David Winsemius wrote:
 
 On May 11, 2011, at 6:26 PM, Matthew Keller wrote:
 
 Not to rehash an old statistical argument, but I think David's reply
 here is too strong (In the presence of interactions there is little
 point in attempting to assign meaning to individual coefficients.).
 As David notes, the simple effect of your coefficients (e.g., a) has
 an interpretation: it is the predicted effect of a when b, c, and d
 are zero. If the zero-level of b, c, and d are meaningful (e.g., if
 you have centered all your variables such that the mean of each one is
 zero), then the coefficient of a is the predicted slope of a at the
 mean level of all other predictors...
 
 And there is internal evidence that such a procedure was not performed  
 in this instance. I think my advice applies here.
 
 -- 
 David.

 Matt



 On Wed, May 11, 2011 at 2:40 PM, Greg Snow <greg.s...@imail.org>  
 wrote:
 Just to add to what David already said, you might want to look at  
 the Predict.Plot and TkPredict functions in the TeachingDemos  
 package for a simple interface for visualizing predicted values in  
 regression models.

 These plots are much more informative than a single number trying  
 to capture total effect.

 --
 Gregory (Greg) L. Snow Ph.D.
 Statistical Data Center
 Intermountain Healthcare
 greg.s...@imail.org
 801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of David Winsemius
 Sent: Wednesday, May 11, 2011 7:48 AM
 To: Michael Haenlein
 Cc: r-help@r-project.org
 Subject: Re: [R] Total effect of X on Y under presence of  
 interaction
 effects


 On May 11, 2011, at 4:26 AM, Michael Haenlein wrote:

 Dear all,

 this is probably more a statistics question than an R question but
 probably
 there is somebody who can help me nevertheless.

 I'm running a regression with four predictors (a, b, c, d) and all
 their
 interaction effects using lm. Based on theory I assume that a
 influences y
 positively. In my output (see below) I see, however, a negative
 regression
 coefficient for a. But several of the interaction effects of a with
 b, c and
 d have positive signs. I don't really understand this. Do I have to
 add up
 the coefficient for the main effect and the ones of all interaction
 effects
 to get a total effect of a on y? Or am I doing something wrong  
 here?

 In the presence of interactions there is little point in  
 attempting to
 assign meaning to individual coefficients. You need to use predict()
 (possibly with graphical or tabular displays) and produce  
 estimates of
 one or two variable at relevant levels of  the other variables.

 The other aspect about which your model is not informative, is the
 possibility that some of these predictors have non-linear  
 associations
 with `y`.

 (The coefficient for `a` examined in isolation might apply to a  
 group
 of subjects (or other units of analysis) in which the values of `b`,
 `c`, and `d` were all held at zero. Is that even a situation that
 would occur in your domain of investigation?)

 --
 David.

 Thanks very much for your answer in advance,

 Regards,

 Michael


 Michael Haenlein
 Associate Professor of Marketing
 ESCP Europe
 Paris, France



 Call:
 lm(formula = y ~ a * b * c * d)

 Residuals:
     Min       1Q   Median       3Q      Max 
 -44.919   -5.184    0.294    5.232  115.984 

 Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
 (Intercept)  27.3067     0.8181  33.379  < 2e-16 ***
 a           -11.0524     2.0602  -5.365 8.25e-08 ***
 b            -2.5950     0.4287  -6.053 1.47e-09 ***
 c           -22.0025     2.8833  -7.631 2.50e-14 ***
 d            20.5037     0.3189  64.292  < 2e-16 ***
 a:b          15.1411     1.1862  12.764  < 2e-16 ***
 a:c          26.8415     7.2484   3.703 0.000214 ***
 b:c           8.3127     1.5080   5.512 3.61e-08 ***
 a:d           6.6221     0.8061   8.215 2.33e-16 ***
 b:d          -2.0449     0.1629 -12.550  < 2e-16 ***
 c:d          10.0454     1.1506   8.731  < 2e-16 ***
 a:b:c         1.4137     4.1579   0.340 0.733862    
 a:b:d        -6.1547     0.4572 -13.463  < 2e-16 ***
 a:c:d       -20.6848     2.8832  -7.174 7.69e-13 ***
 b:c:d        -3.4864     0.6041  -5.772 8.05e-09 ***
 a:b:c:d       5.6184     1.6539   3.397 0.000683 ***
 ---
 Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

 Residual standard error: 7.913 on 12272 degrees of freedom
 Multiple R-squared: 0.8845, Adjusted 

[R] strength of seasonal component

2011-05-12 Thread SNV Krishna
Hi All,
 
a) Is it possible to estimate the strength of seasonality in time series
data? Say I have monthly mean prices of ten different assets. I decompose
the data using stl() and obtain the seasonal parameter for each month. Is it
possible to order the assets based on the strength of seasonality?
 
b) Which gives a better estimate of seasonality: stl() or a robust linear
model like MASS::rlm(mean price ~ month), considering the fact that the
variable analysed is a price series?
 
Many thanks for the insight and help
 
Regards,
 
Krishna
 

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R won't start keeps crashing

2011-05-12 Thread Uwe Ligges

Delete the file .RData in your working directory and try to start R again.

Uwe Ligges




On 12.05.2011 11:09, Bazman76 wrote:

Hi there,

I am reletively new user I only dowloaded is about a week ago.

I was getting along fine but last night I tried to selected save workspace.

Since then R will not work And I really really need it.

There are two eror massages:

The first if a pop up box is Fatal error: unable to resolve data in
.RData.

The second is a message in the GUI: Error in loadNamespace(name) there is
no package called vars.

When I click to dismiss the message box the whole thing just shuts down!!

I have tried reinstalling but it has made no difference.

Please Help





--
View this message in context: 
http://r.789695.n4.nabble.com/R-won-t-start-keeps-crashing-tp3516829p3516829.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to fit a random data into Beta distribution?

2011-05-12 Thread David Winsemius


On May 11, 2011, at 11:17 PM, MikeK wrote:


I am also trying to fit data to a beta distribution.

In Ang and Tang, Probability Concepts in Engineering, 2nd Ed., page  
127-9,
they describe a variant of a beta distribution with additional  
parameters
than the standard beta distribution, enabling specification of a max  
and min

value other than 0,1. This would be very useful for my purposes.

Any thoughts on how to fit a distribution directly to this variant  
of the

beta distribution, without starting from scratch?



Scale your data to [0,1], fit, predict, invert the scaling.

xscaled <- (x-min(x))/max(x)

xrescaled <- max(x)*xscaled + min(x)

(Better check that I made the correct order of those operations. The  
first attempt was wrong ... I think.)
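
A hedged sketch of that recipe (assuming MASS is available and x holds the raw
data; the small shrink keeps the scaled values strictly inside (0,1) so the
beta log-likelihood stays finite):

library(MASS)
xs  <- (x - min(x)) / (max(x) - min(x))   # min-max rescale to [0, 1]
n   <- length(xs)
xs  <- (xs * (n - 1) + 0.5) / n           # pull away from exactly 0 and 1
fit <- fitdistr(xs, "beta", start = list(shape1 = 2, shape2 = 2))
fit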


--
David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] deiversity / density

2011-05-12 Thread Matevž Pavlič
Hi all, 

 

 

I have a point data set (SHP) with coordinates and an attribute (i.e. type of 
point). 

These points are scattered around a fairly big area. What I would like to do is 
to find a sub-area where the density of points combined with the diversity of 
types is greatest. 

Does anyone have any idea if this is somehow possible to do in R? Any idea 
would be greatly appreciated, 

 

 

m

 


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to extract information from the following dataset?

2011-05-12 Thread John Kane
? subset  day = x time  y | time  z
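
Spelled out as a hedged sketch (hypothetical data frame dat with a POSIXct
column named when and a measurement column named meas):

subset(dat, format(when, "%Y-%m-%d %H") == "2010-01-27 16")$meas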

--- On Thu, 5/12/11, hwright heather.wri...@maine.edu wrote:

 From: hwright heather.wri...@maine.edu
 Subject: Re: [R] How to extract information from the following dataset?
 To: r-help@r-project.org
 Received: Thursday, May 12, 2011, 6:18 AM
 
 I have the following data set, in a csv file, looks like
 the following:
 
 Jan 27, 2010  16:01:24,000 125 - - -
 Jan 27, 2010  16:06:24,000 125 - - -
 ..
 The first few columns are month, day, year, time with OS3
 accuracy. And the
 last number is the measurement I need to extract.
 I wonder if there is a easy way to just take out the
 measurements only from
 a specific day and hour
 -- 
 Xin Zhang
 Ph.D Candidate
 Department of Statistics
 University of California, Riverside
 ---
 
 I use strptime to configure the date format in my times
 series dataset.
 First check to see how the dates are read.
 For example:
 # check the structure
 str(your_file)
 'data.frame' ...etc
 This tells me that my original date is a factor but not in
 POSIXlt format.
 
 #check your column dates
 head(your_file)
 [1] 1984-01-26 1984-02-09 1984-03-01 1984-03-15
 1984-03-29
 1984-04-12
 These are discrete column dates.
 
 #convert your date format
  your_file$date <- strptime(your_file$date, "%m/%d/%Y")
 call ?strptime for options
 
 Example:
 For a specific day or hour, strptime would utilize:
  strptime(your_file$date, "%d/%I") for day and hour.
 
 Once you extract the type of date format you want, run
 str(your_file) again
 to confirm the format change.
 Does this answer your question?
 Best,
 
 
 
 -
 ---
 Heather A. Wright, PhD candidate
 Ecology and Evolution of Plankton
 Stazione Zoologica Anton Dohrn
 Villa Comunale
 80121 - Napoli, Italy
 --
 View this message in context: 
 http://r.789695.n4.nabble.com/How-to-extract-information-from-the-following-dataset-tp3516752p3516952.html
 Sent from the R help mailing list archive at Nabble.com.
 
 __
 R-help@r-project.org
 mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained,
 reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R won't start keeps crashing

2011-05-12 Thread Bazman76
OK I did a search for the files and got:

.Rdata which is 206KB
Canada.Rdata which is 3kB

If I click on .Rdata I get the crash.

If I click on Canada.Rdata the system starts?

also they are stored in different places?

.Rdata is in My documents
Canada.RData is in My Documents\vars\vars\data

I assume I should delete .Rdata, should I also delete the other file?
Where should these files be stored?

I think the .RData was a result of my trying to save the workspace. Are
these files being saved to the correct location?

Thanks for your help


--
View this message in context: 
http://r.789695.n4.nabble.com/R-won-t-start-keeps-crashing-tp3516829p3517115.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Binomial

2011-05-12 Thread Sarah Sanchez
Dear R helpers,

I am raising one query regarding this Binomial thread with the sole intention 
of learning something more, as I understand the R forum is an ocean of knowledge.

I was going through all the responses, but wondered that the original query was 
about generating Binomial random numbers, while the R code produced so far 
generates Bernoulli random numbers, i.e. 0 and 1. 

A true Binomial random variable is nothing but the number of successes in a fixed 
number of Bernoulli trials. As I said, I don't understand much about Statistics; 
I just couldn't stop myself from asking.

Regards

Sarah

--- On Thu, 5/12/11, David Winsemius dwinsem...@comcast.net wrote:

From: David Winsemius dwinsem...@comcast.net
Subject: Re: [R] Binomial
To: Alexander Engelhardt a...@chaotic-neutral.de
Cc: r-help@r-project.org, blutack x-jess-...@hotmail.co.uk
Date: Thursday, May 12, 2011, 11:08 AM


On May 12, 2011, at 5:02 AM, Alexander Engelhardt wrote:

 Am 12.05.2011 10:46, schrieb blutack:
 Hi, I need to create a function which generates a Binomial random number
 without using the rbinom function. Do I need to use the choose function or
 am I better just using a sample?
 Thanks.
 
 I think I remember other software who generates binomial data with e.g. 
 pi=0.7 by
 
  pi <- 0.7

I hope Allan knows this and is just being humorous here,  but for the less 
experienced in the audience ... Choosing a different threshold variable name 
might be less error prone. `pi` is one of few built-in constants in R and there 
may be code that depends on that fact.

> pi
[1] 3.141593
> pi <- 0.7
> pi
[1] 0.7
> rm(pi)
> pi
[1] 3.141593

  x <- runif(100) > pi
 summary(x)

Another method would be:

  x <- sample(c(0,1), 100, replace=TRUE, prob=c(0.7, 0.3))

--
David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] maximum likelihood convergence reproducing Anderson Blundell 1982 Econometrica R vs Stata

2011-05-12 Thread Mike Marchywka




So what was the final verdict on this discussion? I kind of lost track; 
if anyone has a minute, please summarize and critique my summary below.


Apparently there were two issues, the comparison between R and Stata
was one issue and the optimum solution another. As I understand it,
there was some question about R numerical gradient calculation. This would
suggest some features of the function may be of interest to consider. 

The function to be optimized appears to be, as OP stated, 
some function of residuals of two ( unrelated ) fits. 
The residual vectors e1 and e2 are dotted in various combinations
creating a matrix whose determinant is (e1.e1)(e2.e2)-(e1.e2)^2 which
is the result to be minimized by choice of theta. Theta it seems is
an 8 component vector, 4 components determine e1 and the other 4 e2.
Presumably a unique solution would require that e1 and e2, both n-component
vectors, point in different directions, or else both could become arbitrarily large
while keeping the error signal at zero. For fixed magnitudes, collinearity
would reduce the error.  The intent would appear to be to
keep the residuals distributed similarly in the two (unrelated) fits.
 I guess my question is,
 did anyone determine that there is a unique solution? or
am I totally wrong here ( I haven't used these myself to any
extent and just try to run some simple teaching examples, asking
for my own clarification as much as anything).

Thanks.











 From: rvarad...@jhmi.edu
 To: pda...@gmail.com; alex.ols...@gmail.com
 Date: Sat, 7 May 2011 11:51:56 -0400
 CC: r-help@r-project.org
 Subject: Re: [R] maximum likelihood convergence reproducing Anderson Blundell 
 1982 Econometrica R vs Stata

 There is something strange in this problem. I think the log-likelihood is 
 incorrect. See the results below from optimx. You can get much larger 
 log-likelihood values than for the exact solution that Peter provided.

 ## model 18
 lnl <- function(theta, y1, y2, x1, x2, x3) {
 n <- length(y1)
 beta <- theta[1:8]
 e1 <- y1 - theta[1] - theta[2]*x1 - theta[3]*x2 - theta[4]*x3
 e2 <- y2 - theta[5] - theta[6]*x1 - theta[7]*x2 - theta[8]*x3
 e <- cbind(e1, e2)
 sigma <- t(e)%*%e
 logl <- -1*n/2*(2*(1+log(2*pi)) + log(det(sigma)))  # it looks like there is something wrong here
 return(-logl)
 }

 data <- read.table("e:/computing/optimx_example.dat", header=TRUE, sep=",")

 attach(data)

 require(optimx)

 start <- c(coef(lm(y1~x1+x2+x3)), coef(lm(y2~x1+x2+x3)))

 # the warnings can be safely ignored in the optimx calls
 p1 <- optimx(start, lnl, hessian=TRUE, y1=y1, y2=y2,
 + x1=x1, x2=x2, x3=x3, control=list(all.methods=TRUE, maxit=1500))

 p2 <- optimx(rep(0,8), lnl, hessian=TRUE, y1=y1, y2=y2,
 + x1=x1, x2=x2, x3=x3, control=list(all.methods=TRUE, maxit=1500))

 p3 <- optimx(rep(0.5,8), lnl, hessian=TRUE, y1=y1, y2=y2,
 + x1=x1, x2=x2, x3=x3, control=list(all.methods=TRUE, maxit=1500))

 Ravi.
 
 From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf 
 Of peter dalgaard [pda...@gmail.com]
 Sent: Saturday, May 07, 2011 4:46 AM
 To: Alex Olssen
 Cc: r-help@r-project.org
 Subject: Re: [R] maximum likelihood convergence reproducing Anderson Blundell 
 1982 Econometrica R vs Stata

 On May 6, 2011, at 14:29 , Alex Olssen wrote:

  Dear R-help,
 
  I am trying to reproduce some results presented in a paper by Anderson
  and Blundell in 1982 in Econometrica using R.
  The estimation I want to reproduce concerns maximum likelihood
  estimation of a singular equation system.
  I can estimate the static model successfully in Stata but for the
  dynamic models I have difficulty getting convergence.
  My R program which uses the same likelihood function as in Stata has
  convergence properties even for the static case.
 
  I have copied my R program and the data below. I realise the code
  could be made more elegant - but it is short enough.
 
  Any ideas would be highly appreciated.

 Better starting values would help. In this case, almost too good values are 
 available:

 start <- c(coef(lm(y1~x1+x2+x3)), coef(lm(y2~x1+x2+x3)))

 which appears to be the _exact_ solution.

 Apart from that, it seems that the conjugate gradient methods have 
 difficulties with this likelihood, for some less than obvious reason. 
 Increasing the maxit gets you closer but still not satisfactory.

 I would suggest trying out the experimental optimx package. Apparently, some 
 of the algorithms in there are much better at handling this likelihood, 
 notably nlm and nlminb.




 
  ## model 18
  lnl <- function(theta, y1, y2, x1, x2, x3) {
  n <- length(y1)
  beta <- theta[1:8]
  e1 <- y1 - theta[1] - theta[2]*x1 - theta[3]*x2 - theta[4]*x3
  e2 <- y2 - theta[5] - theta[6]*x1 - theta[7]*x2 - theta[8]*x3
  e <- cbind(e1, e2)
  sigma <- t(e)%*%e
  logl <- -1*n/2*(2*(1+log(2*pi)) + log(det(sigma)))
  return(-logl)
  }
  p <- optim(0*c(1:8), lnl, method="BFGS", hessian=TRUE, y1=y1, y2=y2,
  x1=x1, 

[R] Simple order() data frame question.

2011-05-12 Thread John Kane
Clearly, I don't understand what order() is doing, and as usual the help for 
order seems to only confuse me more. For some reason I just don't follow the 
examples there. I must be missing something about the data frame sort there but 
what?

I originally wanted to reverse-order my data frame df1 (see below) by aa (a 
factor) but since this was not working I decided to simplify and order by bb to 
see what was happening!!

I'm obviously doing something stupid but what? 

(df1 <- data.frame(aa=letters[1:10],
                   bb=rnorm(10)))
# Order in ascending order by bb
(df1[order(df1[,2]),] ) # seems to work fine

# Order in descending order by bb.
(df1[order(df1[,-2]),])  # does not seem to work


===
 sessionInfo()
R version 2.13.0 (2011-04-13)
Platform: i386-pc-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=English_Canada.1252  LC_CTYPE=English_Canada.1252
LC_MONETARY=English_Canada.1252
[4] LC_NUMERIC=CLC_TIME=English_Canada.1252

attached base packages:
 [1] grid  grDevices datasets  splines   graphics  stats tcltk 
utils methods   base 

other attached packages:
[1] ggplot2_0.8.9   proto_0.3-9.2   reshape_0.8.4   plyr_1.5.2  
svSocket_0.9-51 TinnR_1.0.3 R2HTML_2.2 
[8] Hmisc_3.8-3 survival_2.36-9

loaded via a namespace (and not attached):
[1] cluster_1.13.3  lattice_0.19-26 svMisc_0.9-61   tools_2.13.0   


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Simple order() data frame question.

2011-05-12 Thread Patrick Breheny

On 05/12/2011 08:32 AM, John Kane wrote:

Clearly, I don't understand what order() is doing and as ususl the help for 
order seems to only confuse me more. For some reason I just don't follow the 
examples there. I must be missing something about the data frame sort there but 
what?

I originally wanted to  reverse-order my data frame df1 (see below) by aa (a 
factor) but since this was not working I decided to simplify and order by bb to 
see what was haqppening!!

I'm obviously doing something stupid but what?

(df1 <- data.frame(aa=letters[1:10],
  bb=rnorm(10)))
# Order in acending order by bb
(df1[order(df1[,2]),] ) # seems to work fine

# Order in decending order by bb.
(df1[order(df1[,-2]),])  # does not seem to work



There is a 'decreasing' option described in the help file for 'order' 
which does what you want:


df1 <- data.frame(aa=letters[1:10], bb=rnorm(10))
df1[order(df1[,2],decreasing=TRUE),]

   aa  bb
6   f  3.16449690
7   g  2.44362935
8   h  0.80990322
1   a  0.06365513
5   e -0.33932586
9   i -0.52119533
2   b -0.65623164
4   d -0.86918700
3   c -1.86750927
10  j -2.21178676

df1[order(df1[,1],decreasing=TRUE),]

   aa  bb
10  j -2.21178676
9   i -0.52119533
8   h  0.80990322
7   g  2.44362935
6   f  3.16449690
5   e -0.33932586
4   d -0.86918700
3   c -1.86750927
2   b -0.65623164
1   a  0.06365513

The expression 'df1[,-2]' removes the second column from df1; clearly 
not what you want here.


--
Patrick Breheny
Assistant Professor
Department of Biostatistics
Department of Statistics
University of Kentucky

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Simple order() data frame question.

2011-05-12 Thread Nick Sabbe
Try
(df1[order(-df1[,2]),])
Putting the minus inside the [ , as in df1[,-2], leaves out the column (in this case column 2).
See ?[.
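
A hedged one-line contrast of the two different minus signs (using the df1 from
the original post):

df1[order(-df1[, 2]), ]   # minus negates the values -> descending sort
df1[, -2]                 # minus as a negative index -> drops column 2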

HTH.


Nick Sabbe
--
ping: nick.sa...@ugent.be
link: http://biomath.ugent.be
wink: A1.056, Coupure Links 653, 9000 Gent
ring: 09/264.59.36

-- Do Not Disapprove




-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of John Kane
Sent: donderdag 12 mei 2011 14:33
To: R R-help
Subject: [R] Simple order() data frame question.

Clearly, I don't understand what order() is doing and as ususl the help for
order seems to only confuse me more. For some reason I just don't follow the
examples there. I must be missing something about the data frame sort there
but what?

I originally wanted to  reverse-order my data frame df1 (see below) by aa (a
factor) but since this was not working I decided to simplify and order by bb
to see what was haqppening!!

I'm obviously doing something stupid but what? 

(df1 <- data.frame(aa=letters[1:10],
 bb=rnorm(10)))
# Order in acending order by bb
(df1[order(df1[,2]),] ) # seems to work fine

# Order in decending order by bb.
(df1[order(df1[,-2]),])  # does not seem to work


===
 sessionInfo()
R version 2.13.0 (2011-04-13)
Platform: i386-pc-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=English_Canada.1252  LC_CTYPE=English_Canada.1252
LC_MONETARY=English_Canada.1252
[4] LC_NUMERIC=CLC_TIME=English_Canada.1252

attached base packages:
 [1] grid  grDevices datasets  splines   graphics  stats tcltk
utils methods   base 

other attached packages:
[1] ggplot2_0.8.9   proto_0.3-9.2   reshape_0.8.4   plyr_1.5.2
svSocket_0.9-51 TinnR_1.0.3 R2HTML_2.2 
[8] Hmisc_3.8-3 survival_2.36-9

loaded via a namespace (and not attached):
[1] cluster_1.13.3  lattice_0.19-26 svMisc_0.9-61   tools_2.13.0   


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Binomial

2011-05-12 Thread Alexander Engelhardt

Am 12.05.2011 13:19, schrieb Sarah Sanchez:

Dear R helpers,

I am raising one query regarding this Binomial thread with the sole intention 
of learning something more as I understand R forum is an ocean of knowledge.

I was going through all the responses, but wondered that original query was 
about generating Binomial random numbers while what the R code produced so far 
generates the Bernoulli Random no.s i.e. 0 and 1.

True Binomial distribution is nothing but no of Bernoulli trials. As I said I 
am a moron and don't understand much about Statistics. Just couldn't stop from 
asking my stupid question.


Oh, yes.
You can generate one B(20, 0.7)-distributed random variable by summing up 
Bernoulli draws like this:


> pie <- 0.7
> x <- runif(20)
> x
 [1] 0.83108099 0.72843379 0.08862017 0.78477878 0.69230873 0.11229410
 [7] 0.64483435 0.87748373 0.17448824 0.43549622 0.30374272 0.76274317
[13] 0.34832376 0.20876835 0.85280612 0.93810355 0.65720548 0.05557451
[19] 0.88041390 0.68938009
> x <- runif(20) < pie
> x
 [1] FALSE FALSE  TRUE FALSE  TRUE  TRUE  TRUE  TRUE FALSE  TRUE FALSE  TRUE
[13] FALSE FALSE  TRUE  TRUE FALSE  TRUE FALSE FALSE
> sum(x)
[1] 10

You could shorten this to

> sum(runif(20) < 0.7)
[1] 12

Which would be the same as

 rbinom(1,20,0.5)
[1] 6

or even

 qbinom(runif(1),20,0.5)
[1] 12


Just play around a little, and learn from the help files:

 ?rbinom

Have fun!



--- On Thu, 5/12/11, David Winsemiusdwinsem...@comcast.net  wrote:

From: David Winsemiusdwinsem...@comcast.net
Subject: Re: [R] Binomial
To: Alexander Engelhardta...@chaotic-neutral.de
Cc: r-help@r-project.org, blutackx-jess-...@hotmail.co.uk
Date: Thursday, May 12, 2011, 11:08 AM





I hope Allan knows this and is just being humorous here,  but for the less 
experienced in the audience ... Choosing a different threshold variable name 
might be less error prone. `pi` is one of few built-in constants in R and there 
may be code that depends on that fact.


pi

[1] 3.141593


He didn't, or rather, he forgot.
Also, that Allan isn't related to me (I think) :)

 - Alex

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Simple order() data frame question.

2011-05-12 Thread John Kane
Ah, this never would have occurred to me.  It's rather obvious now but of 
course, I'll forget it again.  Note to self: put it in the cribsheet.

Thanks very much

--- On Thu, 5/12/11, Nick Sabbe nick.sa...@ugent.be wrote:

 From: Nick Sabbe nick.sa...@ugent.be
 Subject: RE: [R] Simple order()  data frame question.
 To: 'John Kane' jrkrid...@yahoo.ca, 'R R-help' 
 r-h...@stat.math.ethz.ch
 Received: Thursday, May 12, 2011, 8:50 AM
 Try
 (df1[order(-df1[,2]),])
 Adding the minus within the [ leaves out the column (in
 this case column 2).
 See ?[.
 
 HTH.
 
 
 Nick Sabbe
 --
 ping: nick.sa...@ugent.be
 link: http://biomath.ugent.be
 wink: A1.056, Coupure Links 653, 9000 Gent
 ring: 09/264.59.36
 
 -- Do Not Disapprove
 
 
 
 
 -Original Message-
 From: r-help-boun...@r-project.org
 [mailto:r-help-boun...@r-project.org]
 On
 Behalf Of John Kane
 Sent: donderdag 12 mei 2011 14:33
 To: R R-help
 Subject: [R] Simple order() data frame question.
 
 Clearly, I don't understand what order() is doing and as
 ususl the help for
 order seems to only confuse me more. For some reason I just
 don't follow the
 examples there. I must be missing something about the data
 frame sort there
 but what?
 
 I originally wanted to  reverse-order my data frame
 df1 (see below) by aa (a
 factor) but since this was not working I decided to
 simplify and order by bb
 to see what was haqppening!!
 
 I'm obviously doing something stupid but what? 
 
 (df1 <- data.frame(aa=letters[1:10],
      bb=rnorm(10)))
 # Order in acending order by bb
 (df1[order(df1[,2]),] ) # seems to work fine
 
 # Order in decending order by bb.
 (df1[order(df1[,-2]),])  # does not seem to work
 
 
 ===
  sessionInfo()
 R version 2.13.0 (2011-04-13)
 Platform: i386-pc-mingw32/i386 (32-bit)
 
 locale:
 [1] LC_COLLATE=English_Canada.1252 
 LC_CTYPE=English_Canada.1252
 LC_MONETARY=English_Canada.1252
 [4] LC_NUMERIC=C           
        
 LC_TIME=English_Canada.1252    
 
 attached base packages:
  [1] grid      grDevices datasets 
 splines   graphics  stats 
    tcltk
 utils 
    methods   base 
    
 
 other attached packages:
 [1]
 ggplot2_0.8.9   proto_0.3-9.2   reshape_0.8.4   plyr_1.5.2
 svSocket_0.9-51 TinnR_1.0.3 
    R2HTML_2.2     
 [8] Hmisc_3.8-3     survival_2.36-9
 
 loaded via a namespace (and not attached):
 [1] cluster_1.13.3  lattice_0.19-26
 svMisc_0.9-61   tools_2.13.0   
 
 
 __
 R-help@r-project.org
 mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained,
 reproducible code.
 


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Simple order() data frame question.

2011-05-12 Thread John Kane
Argh.  I knew it was at least partly obvious.  I never have been able to read 
the order() help page and understand what it is saying.

Thanks very much.

By the way, to me it is counter-intuitive that the command is

 df1[order(df1[,2],decreasing=TRUE),]

For some reason I keep expecting it to be 
order( , df1[,2],decreasing=TRUE)

So clearly I don't understand what is going on, but at least I am a lot better off. 
 I may be able to get this graph to work.  



--- On Thu, 5/12/11, Patrick Breheny patrick.breh...@uky.edu wrote:

 From: Patrick Breheny patrick.breh...@uky.edu
 Subject: Re: [R] Simple order()  data frame question.
 To: John Kane jrkrid...@yahoo.ca
 Cc: R R-help r-h...@stat.math.ethz.ch
 Received: Thursday, May 12, 2011, 8:44 AM
 On 05/12/2011 08:32 AM, John Kane
 wrote:
  Clearly, I don't understand what order() is doing and
 as ususl the help for order seems to only confuse me more.
 For some reason I just don't follow the examples there. I
 must be missing something about the data frame sort there
 but what?
 
  I originally wanted to  reverse-order my data
 frame df1 (see below) by aa (a factor) but since this was
 not working I decided to simplify and order by bb to see
 what was haqppening!!
 
  I'm obviously doing something stupid but what?
 
  (df1 <- data.frame(aa=letters[1:10],
        bb=rnorm(10)))
  # Order in acending order by bb
  (df1[order(df1[,2]),] ) # seems to work fine
 
  # Order in decending order by bb.
  (df1[order(df1[,-2]),])  # does not seem to work
 
 
 There is a 'decreasing' option described in the help file
 for 'order' 
 which does what you want:
 
 df1- data.frame(aa=letters[1:10],bb=rnorm(10))
 df1[order(df1[,2],decreasing=TRUE),]
 
     aa          bb
 6   f  3.16449690
 7   g  2.44362935
 8   h  0.80990322
 1   a  0.06365513
 5   e -0.33932586
 9   i -0.52119533
 2   b -0.65623164
 4   d -0.86918700
 3   c -1.86750927
 10  j -2.21178676
 
 df1[order(df1[,1],decreasing=TRUE),]
 
     aa          bb
 10  j -2.21178676
 9   i -0.52119533
 8   h  0.80990322
 7   g  2.44362935
 6   f  3.16449690
 5   e -0.33932586
 4   d -0.86918700
 3   c -1.86750927
 2   b -0.65623164
 1   a  0.06365513
 
 The expression 'df1[,-2]' removes the second column from
 df1; clearly 
 not what you want here.
 
 -- 
 Patrick Breheny
 Assistant Professor
 Department of Biostatistics
 Department of Statistics
 University of Kentucky


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Simple order() data frame question.

2011-05-12 Thread Marc Schwartz
On May 12, 2011, at 8:09 AM, John Kane wrote:

 Argh.  I knew it was at least partly obvious.  I never have been able to read 
 the order() help page and understand what it is saying.
 
 THanks very much.
 
 By the way, to me it is counter-intuitive that the the command is
 
 df1[order(df1[,2],decreasing=TRUE),]
 
 For some reason I keep expecting it to be 
 order( , df1[,2],decreasing=TRUE)
 
 So clearly I don't understand what is going on but at least I a lot better 
 off.  I may be able to get this graph to work.  


John,

Perhaps it may be helpful to understand that order() does not actually sort() 
the data. 

It returns a vector of indices into the data, where those indices are the 
sorted ordering of the elements in the vector, or in this case, the column.

So you want the output of order() to be used within the brackets for the row 
*indices*, to reflect the ordering of the column (or columns in the case of a 
multi-level sort) that you wish to use to sort the data frame rows.

set.seed(1)
x <- sample(10)

 x
 [1]  3  4  5  7  2  8  9  6 10  1


# sort() actually returns the sorted data
 sort(x)
 [1]  1  2  3  4  5  6  7  8  9 10


# order() returns the indices of 'x' in sorted order
 order(x)
 [1] 10  5  1  2  3  8  4  6  7  9


# This does the same thing as sort()
 x[order(x)]
 [1]  1  2  3  4  5  6  7  8  9 10


set.seed(1)
df1 <- data.frame(aa = letters[1:10], bb = rnorm(10))

 df1
   aa bb
1   a -0.6264538
2   b  0.1836433
3   c -0.8356286
4   d  1.5952808
5   e  0.3295078
6   f -0.8204684
7   g  0.4874291
8   h  0.7383247
9   i  0.5757814
10  j -0.3053884


# These are the indices of df1$bb in sorted order
 order(df1$bb)
 [1]  3  6  1 10  2  5  7  9  8  4


# Get df1$bb in increasing order
 df1$bb[order(df1$bb)]
 [1] -0.8356286 -0.8204684 -0.6264538 -0.3053884  0.1836433  0.3295078
 [7]  0.4874291  0.5757814  0.7383247  1.5952808


# Same thing as above
 sort(df1$bb)
 [1] -0.8356286 -0.8204684 -0.6264538 -0.3053884  0.1836433  0.3295078
 [7]  0.4874291  0.5757814  0.7383247  1.5952808


You can't use the output of sort() to sort the data frame rows, so you need to 
use order() to get the ordered indices and then use that to extract the data 
frame rows in the sort order that you desire:

 df1[order(df1$bb), ]
   aa bb
3   c -0.8356286
6   f -0.8204684
1   a -0.6264538
10  j -0.3053884
2   b  0.1836433
5   e  0.3295078
7   g  0.4874291
9   i  0.5757814
8   h  0.7383247
4   d  1.5952808


 df1[order(df1$bb, decreasing = TRUE), ]
   aa bb
4   d  1.5952808
8   h  0.7383247
9   i  0.5757814
7   g  0.4874291
5   e  0.3295078
2   b  0.1836433
10  j -0.3053884
1   a -0.6264538
6   f -0.8204684
3   c -0.8356286


Does that help?

Regards,

Marc Schwartz

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] changes in coxph in survival from older version?

2011-05-12 Thread Terry Therneau

On Wed, 2011-05-11 at 16:11 -0700, Shi, Tao wrote:
 Hi all,
 
 I found that the two different versions of survival packages, namely 2.36-5 
 vs. 2.36-8 or later, give different results for coxph function.  Please see 
 below and the data is attached.  The second one was done on Linux, but 
 Windows 
 gave the same results.  Could you please let me know which one I should trust?
 
 Thanks,

 In your case, neither.  Your data set has 22 events and 17 predictors;
the rule of thumb for a reliable Cox model is 10-20 events per predictor
which implies no more than 2 for your data set.  As a result, the
coefficients of your model have very wide confidence intervals, the coef
for Male for instance has se of 3.26, meaning the CI goes from 1/26 to
26 times the estimate; i.e., there is no biological meaning to the
estimate.

  Nevertheless, why did coxph give a different answer?  The later
version 2.36-9 failed to converge (20 iterations) with a final
log-likelihood of -19.94, the earlier code converges in 10 iterations to
-19.91.  In version 2.36-6 an extra check was put into the maximizer for
coxph in response to an exceptional data set which caused the routine to
fail due to overflow of the exp function; the Newton-Raphson iteration
algorithm had made a terrible guess in its iteration path, which can
happen with all NR based search methods.
   I put a limit on the size of the linear predictor in the Cox model of
21.  The basic argument is that exp(linear predictor) = relative risk
for a subject, and that there is not much biological meaning for risks
to be less than exp(-21) ~ 1/(population of the earth).  There is more to
the reasoning; interested parties should look at the comments in
src/coxsafe.c, a 5 line routine with 25 lines of discussion.  I will
happily accept input on the best value for the constant.
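
For orientation (a hedged aside, not from Terry's message): exp(-21) is roughly
7.6e-10, i.e. on the order of one in a billion or so, which is where the
population comparison comes from.

exp(-21)     # about 7.58e-10
exp(21)      # about 1.3e9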

   I never expected to see a data set with both convergence of the LL
and linear predictors larger than +-15.  Looking at the fit (older code)
 round(fit2$linear.predictor, 2)
 [1]   2.26   0.89   4.96 -19.09 -12.10   1.39   2.82   3.10
 [9]  18.57 -25.25  22.94   8.75   5.52 -27.64  14.88 -23.41
[17]  13.70 -28.45  -1.84  10.04  12.62   2.54   6.33  -8.76
[25]   9.68   4.39   2.92   3.51   6.02 -17.24   5.97

This says that, if the model is to be believed, you have several near
immortals in the data set. (Everyone else on earth will perish first).

Terry Therneau

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] separate date and time

2011-05-12 Thread Schatzi
I have a combined date and time. I would like to separate them out into two
columns so I can do things such as take the mean by time across all dates.

meas <- runif(435)
nTime <- seq(1303975800, 1304757000, 1800)
nDateT <- as.POSIXct(nTime, origin="1970-01-01")
mat1 <- cbind(nDateT, meas)

means1 <- aggregate(mat1$meas, list(nDateT), mean)

This doesn't do anything useful as each day is different, but if I had just the
time, it would take the mean, outputting 48 values (one for each 30 min).

Also, sometimes there is a missing meas at a specific time. Is there any way
to copy the previous meas if one is missing?

-
In theory, practice and theory are the same. In practice, they are not - Albert 
Einstein
--
View this message in context: 
http://r.789695.n4.nabble.com/separate-date-and-time-tp3517571p3517571.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R won't start keeps crashing

2011-05-12 Thread Jonathan Daily
This is not very informative. What exactly is crashing? What is your
sessionInfo() output?

On Thu, May 12, 2011 at 7:57 AM, Bazman76 h_a_patie...@hotmail.com wrote:
 OK I did a seach for the files and got:

 .Rdata which is 206KB
 Canada.Rdata which is 3kB

 If I click on .Rdata I get the crash.

 If I click on Canada.Rdata the system starts?

Does it? How would I know?


 also they are stored in different places?


Are they?

 .Rdata is in My documents
 Canada.RData is in My Documents\vars\vars\data

 I assume I should delete .Rdata, should I also delete the other file?
 Where should these files be stored?

 I think the .RData was a result of my trying to save the workspace. Are
 there files being saved to the correct location?

 Thanks for your help


 --
 View this message in context: 
 http://r.789695.n4.nabble.com/R-won-t-start-keeps-crashing-tp3516829p3517115.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


What happens when you try the ?load command from R, like so:

load("C:/path/to/file")

-- 
===
Jon Daily
Technician
===
#!/usr/bin/env outside
# It's great, trust me.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] mtext text size (cex) doesn't match plot

2011-05-12 Thread George Locke
thanks for reading the manual for me :X

2011/5/12 Prof Brian Ripley rip...@stats.ox.ac.uk:
 On Wed, 11 May 2011, George Locke wrote:

 Hi,

 I am using mtext instead of the ylab argument in some plots because i
 want to move it away from the numbers in the axis.  However, the text
 in the X axis,

 for example:
   par(mar=c(5, 5.5, 4, 2));
   plot(data, main="plot name", xlab= 'X axis', ylab="",
        font=2, cex.lab=1.5, font.lab=2, cex.main=1.8);
   mtext('Y axis', side=2, cex=1.5, line=4, font=2);

 This works fine, but if I then set

   par(mfrow=c(3,2));

 the text produced by mtext becomes much larger than the text X axis
 produced by plot, despite their having identical cex specifications.
 In this case, the words Y axis become much larger than plot name.
 Note that without par(mfrow) the size of X axis and Y axis match
 iff their cex(.lab) arguments match.

 How can I make mtext produce text that exactly matches the xlab?  In
 my limited experience fiddling around with this problem, the size of
 the mtext does not depend on par(mfrow), whereas the size of the xlab
 does, so if there were a formula that relates the actual size of text,

 Please do read the help!  ?mtext says

     cex: character expansion factor.  ‘NULL’ and ‘NA’ are equivalent
          to ‘1.0’.  This is an absolute measure, not scaled by
          ‘par(cex)’ or by setting ‘par(mfrow)’ or ‘par(mfcol)’.

 so no 'limited experience fiddling around with this problem' was needed.
  And see ?par:

     ‘cex’ A numerical value giving the amount by which plotting text
          and symbols should be magnified relative to the default.
          This starts as ‘1’ when a device is opened, and is reset when
          the layout is changed, e.g. by setting ‘mfrow’.

     ‘mfcol, mfrow’ A vector of the form ‘c(nr, nc)’.  Subsequent
          figures will be drawn in an ‘nr’-by-‘nc’ array on the device
          by _columns_ (‘mfcol’), or _rows_ (‘mfrow’), respectively.

          In a layout with exactly two rows and columns the base value
          of ‘cex’ is reduced by a factor of 0.83: if there are three
          or more of either rows or columns, the reduction factor is
          0.66.

 cex argument, and par(mfrow), then I could use that to attenuate the
 cex argument of mtext.  Any solution will do, so long as it maintains
 the relative sizes of the plot and the three text fields (main, x axis
 label, y axis label).

 library(fortunes); fortune(14) applies -- see the posting guide.

 --
 Brian D. Ripley,                  rip...@stats.ox.ac.uk
 Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
 University of Oxford,             Tel:  +44 1865 272861 (self)
 1 South Parks Road,                     +44 1865 272866 (PA)
 Oxford OX1 3TG, UK                Fax:  +44 1865 272595

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Simple order() data frame question.

2011-05-12 Thread Ivan Calandra
I was wondering whether it would be possible to make a method for 
data.frame with sort().
I think it would be more intuitive than using the complex construction 
of df[order(df$a),]

Is there any reason not to make it?

Ivan
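
One possible sketch of such a method (a hypothetical helper added for
illustration, not an existing base or package function; the column to sort by
is passed by name):

sort.data.frame <- function(x, decreasing = FALSE, by = names(x)[1], ...) {
    # order the rows by a single column, using the usual df[order(...), ] idiom
    x[order(x[[by]], decreasing = decreasing), , drop = FALSE]
}
# usage: sort(df1, by = "bb")
#        sort(df1, by = "bb", decreasing = TRUE)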

On 5/12/2011 15:40, Marc Schwartz wrote:

On May 12, 2011, at 8:09 AM, John Kane wrote:


Argh.  I knew it was at least partly obvious.  I never have been able to read 
the order() help page and understand what it is saying.

THanks very much.

By the way, to me it is counter-intuitive that the the command is


df1[order(df1[,2],decreasing=TRUE),]

For some reason I keep expecting it to be
order( , df1[,2],decreasing=TRUE)

So clearly I don't understand what is going on but at least I a lot better off. 
 I may be able to get this graph to work.


John,

Perhaps it may be helpful to understand that order() does not actually sort() 
the data.

It returns a vector of indices into the data, where those indices are the 
sorted ordering of the elements in the vector, or in this case, the column.

So you want the output of order() to be used within the brackets for the row 
*indices*, to reflect the ordering of the column (or columns in the case of a 
multi-level sort) that you wish to use to sort the data frame rows.

set.seed(1)
x <- sample(10)


x

  [1]  3  4  5  7  2  8  9  6 10  1


# sort() actually returns the sorted data

sort(x)

  [1]  1  2  3  4  5  6  7  8  9 10


# order() returns the indices of 'x' in sorted order

order(x)

  [1] 10  5  1  2  3  8  4  6  7  9


# This does the same thing as sort()

x[order(x)]

  [1]  1  2  3  4  5  6  7  8  9 10


set.seed(1)
df1 <- data.frame(aa = letters[1:10], bb = rnorm(10))


df1

aa bb
1   a -0.6264538
2   b  0.1836433
3   c -0.8356286
4   d  1.5952808
5   e  0.3295078
6   f -0.8204684
7   g  0.4874291
8   h  0.7383247
9   i  0.5757814
10  j -0.3053884


# These are the indices of df1$bb in sorted order

order(df1$bb)

  [1]  3  6  1 10  2  5  7  9  8  4


# Get df1$bb in increasing order

df1$bb[order(df1$bb)]

  [1] -0.8356286 -0.8204684 -0.6264538 -0.3053884  0.1836433  0.3295078
  [7]  0.4874291  0.5757814  0.7383247  1.5952808


# Same thing as above

sort(df1$bb)

  [1] -0.8356286 -0.8204684 -0.6264538 -0.3053884  0.1836433  0.3295078
  [7]  0.4874291  0.5757814  0.7383247  1.5952808


You can't use the output of sort() to sort the data frame rows, so you need to 
use order() to get the ordered indices and then use that to extract the data 
frame rows in the sort order that you desire:


df1[order(df1$bb), ]

aa bb
3   c -0.8356286
6   f -0.8204684
1   a -0.6264538
10  j -0.3053884
2   b  0.1836433
5   e  0.3295078
7   g  0.4874291
9   i  0.5757814
8   h  0.7383247
4   d  1.5952808



df1[order(df1$bb, decreasing = TRUE), ]

aa bb
4   d  1.5952808
8   h  0.7383247
9   i  0.5757814
7   g  0.4874291
5   e  0.3295078
2   b  0.1836433
10  j -0.3053884
1   a -0.6264538
6   f -0.8204684
3   c -0.8356286


Does that help?

Regards,

Marc Schwartz

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Museum
Abt. Säugetiere
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY
+49(0)40 42838 6231
ivan.calan...@uni-hamburg.de

**
http://www.for771.uni-bonn.de
http://webapp5.rrz.uni-hamburg.de/mammals/eng/1525_8_1.php

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] mtext text size (cex) doesn't match plot

2011-05-12 Thread Peter Ehlers

On 2011-05-12 07:16, George Locke wrote:

thanks for reading the manual for me :X


For a bit more reading, you could check out ?title.
You could replace your mtext() calls with

  title(ylab='Y axis', cex.lab=1.5, line=4, font.lab=2)

Peter Ehlers



2011/5/12 Prof Brian Ripleyrip...@stats.ox.ac.uk:

On Wed, 11 May 2011, George Locke wrote:


Hi,

I am using mtext instead of the ylab argument in some plots because i
want to move it away from the numbers in the axis.  However, the text
in the X axis,

for example:
   par(mar=c(5, 5.5, 4, 2));
plot(data, main="plot name", xlab= 'X axis', ylab="",
font=2, cex.lab=1.5, font.lab=2, cex.main=1.8);
   mtext('Y axis', side=2, cex=1.5, line=4, font=2);

This works fine, but if I then set

   par(mfrow=c(3,2));

the text produced by mtext becomes much larger than the text X axis
produced by plot, despite their having identical cex specifications.
In this case, the words Y axis become much larger than plot name.
Note that without par(mfrow) the size of X axis and Y axis match
iff their cex(.lab) arguments match.

How can I make mtext produce text that exactly matches the xlab?  In
my limited experience fiddling around with this problem, the size of
the mtext does not depend on par(mfrow), whereas the size of the xlab
does, so if there were a formula that relates the actual size of text,


Please do read the help!  ?mtext says

 cex: character expansion factor.  ‘NULL’ and ‘NA’ are equivalent
  to ‘1.0’.  This is an absolute measure, not scaled by
  ‘par(cex)’ or by setting ‘par(mfrow)’ or ‘par(mfcol)’.

so no 'limited experience fiddling around with this problem' was needed.
  And see ?par:

 ‘cex’ A numerical value giving the amount by which plotting text
  and symbols should be magnified relative to the default.
  This starts as ‘1’ when a device is opened, and is reset when
  the layout is changed, e.g. by setting ‘mfrow’.

 ‘mfcol, mfrow’ A vector of the form ‘c(nr, nc)’.  Subsequent
  figures will be drawn in an ‘nr’-by-‘nc’ array on the device
  by _columns_ (‘mfcol’), or _rows_ (‘mfrow’), respectively.

  In a layout with exactly two rows and columns the base value
  of ‘cex’ is reduced by a factor of 0.83: if there are three
  or more of either rows or columns, the reduction factor is
  0.66.


cex argument, and par(mfrow), then I could use that to attenuate the
cex argument of mtext.  Any solution will do, so long as it maintains
the relative sizes of the plot and the three text fields (main, x axis
label, y axis label).


library(fortunes); fortune(14) applies -- see the posting guide.

--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] lm and anova

2011-05-12 Thread Ista Zahn
Hi Sara,
As the help page for anova.lm says,

Specifying a single object gives a sequential analysis of variance table.

That is most likely also the answer to your second question.

The anova function can be used to compare nested models, and this
provides the flexibility to test arbitrary hypotheses, including all
the ones given by different types of ANOVA tables. You may also find
the Anova() function in the car package helpful.

Best,
Ista
On Thu, May 12, 2011 at 2:37 AM, Sara Sjöstedt de Luna
sara.de.l...@math.umu.se wrote:
 Hi!

 We have run a linear regression model with 3 explanatory variables and get 
 the output below.
 Does anyone know what type of test the anova model below does and why we get 
 so different result in terms of significant variables by the two tables?

 Thanks!

 /Sara

 summary(model)
 Call:
 lm(formula = log(HOBU) ~ Vole1 + Volelag + Year)
 Residuals:
      Min        1Q    Median        3Q       Max
 -0.757284 -0.166681  0.009478  0.181304  0.692916
 Coefficients:
              Estimate Std. Error t value Pr(>|t|)
 (Intercept) 80.041737  12.018726   6.660 1.40e-07 ***
 Vole1        0.005521   0.041626   0.133   0.8953
 Volelag      0.033966   0.018392   1.847   0.0738 .
 Year        -0.035927   0.006027  -5.961 1.08e-06 ***

 anova(model)
 Analysis of Variance Table
 Response: log(HOBU)
           Df Sum Sq Mean Sq F value    Pr(>F)
 Vole1      1 1.7877  1.7877 13.1772 0.0009486 ***
 Volelag    1 0.5817  0.5817  4.2878 0.0462831 *
 Year       1 4.8205  4.8205 35.5323 1.082e-06 ***
 Residuals 33 4.4769  0.1357


        [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] lm and anova

2011-05-12 Thread Paul Chatfield
anova() uses sequential sums of squares (type 1); summary() uses adjusted sums
of squares (type 3).

Take for example the first line of each output.  In summary() this tests
whether vole1 is needed ASSUMING volelag and year are already in the model
(conclusion would then be: it isn't needed, p = .89).  Whereas in anova() it
tests whether vole1 is needed assuming nothing else is in the model
(conclusion: vole1 is better than nothing, p = .0009).

anova() assumes all terms above a line are in the model but terms below it are
not, so volelag assumes vole1 is in the model but not year.  You can see how
anova() changes but summary() doesn't by varying the order you put the terms in.
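
A quick way to see this with made-up data (a sketch, not from the original
posts; x1 and x2 are made correlated on purpose so the difference shows):

set.seed(42)
x1 <- rnorm(30)
x2 <- x1 + rnorm(30)       # correlated with x1
y  <- x1 + 0.5*x2 + rnorm(30)
anova(lm(y ~ x1 + x2))     # sequential SS: x1 unadjusted, x2 adjusted for x1
anova(lm(y ~ x2 + x1))     # reorder the terms and the SS change
summary(lm(y ~ x1 + x2))   # the adjusted t-tests do not depend on the order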

so the final model I would fit here would probably end up being either
year+volelag or just year,

HTH,

Paul

--
View this message in context: 
http://r.789695.n4.nabble.com/lm-and-anova-tp3516748p3517356.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Binomial

2011-05-12 Thread Sarah Sanchez
Thanks a lot sir.

Regards

Sarah

--- On Thu, 5/12/11, Alexander Engelhardt a...@chaotic-neutral.de wrote:

From: Alexander Engelhardt a...@chaotic-neutral.de
Subject: Re: [R] Binomial
To: Sarah Sanchez sarah_sanche...@yahoo.com
Cc: David Winsemius dwinsem...@comcast.net, r-help@r-project.org
Date: Thursday, May 12, 2011, 12:53 PM

Am 12.05.2011 13:19, schrieb Sarah Sanchez:
 Dear R helpers,
 
 I am raising one query regarding this Binomial thread with the sole 
 intention of learning something more as I understand R forum is an ocean of 
 knowledge.
 
 I was going through all the responses, but wondered that original query was 
 about generating Binomial random numbers while what the R code produced so 
 far generates the Bernoulli Random no.s i.e. 0 and 1.
 
 True Binomial distribution is nothing but no of Bernoulli trials. As I said I 
 am a moron and don't understand much about Statistics. Just couldn't stop 
 from asking my stupid question.

Oh, yes.
You can generate one B(20,0.7)-distributed random variable by summing up 20
Bernoulli draws, like this:

 pie <- 0.7
 x <- runif(20)
 x
 [1] 0.83108099 0.72843379 0.08862017 0.78477878 0.69230873 0.11229410
 [7] 0.64483435 0.87748373 0.17448824 0.43549622 0.30374272 0.76274317
[13] 0.34832376 0.20876835 0.85280612 0.93810355 0.65720548 0.05557451
[19] 0.88041390 0.68938009
 x <- runif(20) < pie
 x
 [1] FALSE FALSE  TRUE FALSE  TRUE  TRUE  TRUE  TRUE FALSE  TRUE FALSE  TRUE
[13] FALSE FALSE  TRUE  TRUE FALSE  TRUE FALSE FALSE
 sum(x)
[1] 10

You could shorten this to

 sum(runif(20) < 0.7)
[1] 12

Which would be the same as

 rbinom(1,20,0.5)
[1] 6

or even

 qbinom(runif(1),20,0.5)
[1] 12


Just play around a little, and learn from the help files:

 ?rbinom

Have fun!

 
 --- On Thu, 5/12/11, David Winsemiusdwinsem...@comcast.net  wrote:
 
 From: David Winsemiusdwinsem...@comcast.net
 Subject: Re: [R] Binomial
 To: Alexander Engelhardta...@chaotic-neutral.de
 Cc: r-help@r-project.org, blutackx-jess-...@hotmail.co.uk
 Date: Thursday, May 12, 2011, 11:08 AM
 

 
 I hope Allan knows this and is just being humorous here,  but for the less 
 experienced in the audience ... Choosing a different threshold variable name 
 might be less error prone. `pi` is one of few built-in constants in R and 
 there may be code that depends on that fact.
 
 pi
 [1] 3.141593

He didn't, or rather, he forgot.
Also, that Allan isn't related to me (I think) :)

 - Alex

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Asking Favor For the Script of Median Filter

2011-05-12 Thread robleaf
Here is one I wrote for the raster package. It searches a raster layer for
NAs and fills each one with the median of its non-NA adjacent cells, provided
at least neighbor.count of them are available. You could turn your matrix into
a raster to make it work, or change the code.

Hope you find it useful, Robert

neighbor.filter <- function(raster.layer, neighbor.count = 3) {

  require(raster)

  base.rast   <- raster.layer
  count       <- 1
  NA.ind      <- which(is.na(base.rast[]))       # cells to fill
  median.vals <- matrix(NA, length(NA.ind), 3)   # cell index, median, neighbour count

  for (j in 1:length(NA.ind)) {

    # row/column of the NA cell
    row.ind.NA <- rowFromCell(base.rast, NA.ind[j])
    col.ind.NA <- colFromCell(base.rast, NA.ind[j])

    # the 3 x 3 window around it
    row.ind <- c(row.ind.NA - 1, row.ind.NA, row.ind.NA + 1)
    col.ind <- c(col.ind.NA - 1, col.ind.NA, col.ind.NA + 1)

    row.ind.check <- expand.grid(row.ind, col.ind)[, 1]
    col.ind.check <- expand.grid(row.ind, col.ind)[, 2]

    # drop window cells whose rows fall outside the raster
    ind.del.1 <- c(which(row.ind.check > dim(base.rast)[1]), which(row.ind.check < 1))
    if (length(ind.del.1) > 0) {
      row.ind.check <- row.ind.check[-ind.del.1]
      col.ind.check <- col.ind.check[-ind.del.1]  }

    # ... and whose columns fall outside the raster
    ind.del.2 <- c(which(col.ind.check < 1), which(col.ind.check > dim(base.rast)[2]))
    if (length(ind.del.2) > 0) {
      row.ind.check <- row.ind.check[-ind.del.2]
      col.ind.check <- col.ind.check[-ind.del.2]  }

    # only fill if enough non-NA (positive) neighbours are available
    if (length(which(base.rast[cellFromRowCol(base.rast, row.ind.check,
        col.ind.check)] > 0)) >= neighbor.count) {

      median.vals[count, c(1:3)] <- c(NA.ind[j],
          median(base.rast[cellFromRowCol(base.rast,
              row.ind.check, col.ind.check)], na.rm = TRUE),
          length(which(base.rast[cellFromRowCol(base.rast,
              row.ind.check, col.ind.check)] > 0)))
      count <- count + 1
    }
  }

  # keep only rows that were actually filled, then write the medians back
  median.vals <- median.vals[which(median.vals[, 1] > 0), ]
  base.rast[median.vals[, 1]] <- median.vals[, 2]

  return(base.rast)  }
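
A hypothetical usage sketch (toy raster invented here, not from the original
post; it assumes the function above has been sourced):

library(raster)
m <- matrix(runif(100), 10, 10)
m[sample(100, 5)] <- NA                      # poke a few holes to fill
r <- raster(m)                               # matrix -> RasterLayer
filled <- neighbor.filter(r, neighbor.count = 3)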


Robert Leaf, PhD
NOAA Narragansett Laboratory

--
View this message in context: 
http://r.789695.n4.nabble.com/Asking-Favor-For-the-Script-of-Median-Filter-tp3409462p3517365.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Simple 95% confidence interval for a median

2011-05-12 Thread Georgina Imberger
Hi!

I have a data set of 86 values that are non-normally distributed (counts).

The median value is 10. I want to get an estimate of the  95% confidence
interval for this median value.

I tried to use a one-sample Wilcoxon test:

wilcox.test(Comps,mu=10,conf.int=TRUE)

and got the following output:

 Wilcoxon signed rank test with continuity correction

data:  Comps
V = 2111, p-value = 0.05846
alternative hypothesis: true location is not equal to 10
95 percent confidence interval:
 10.0 17.49993
sample estimates:
(pseudo)median
  12.50006

I wonder if someone would mind helping me out?

What am I doing wrong?
What is the '(pseudo)median'?
Can I get R to estimate the confidence around the actual median of 10?

With thanks,
Georgie

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] (no subject)

2011-05-12 Thread Fabian
#subject: type III sum of squares - anova() Anova() AnovaM()
#R-version: 2.12.2

#Hello everyone,

#I am currently evaluating experimental data of a two-factor experiment.
#To illustrate my problem I will use the following dummy dataset: factor T1
#has 3 levels (A, B, C) and factor T2 has 2 levels (E, F). The design is
#completely balanced; each factor combination has 4 replicates.

#the dataset looks like this:

T1<-(c(rep(c("A","B","C"),each=8)))
T2<-(c(rep(rep(c("E","F"),each=4),3)))
RESPONSE<-c(1,2,3,2,2,1,3,2,9,8,8,9,6,5,5,6,5,5,5,6,1,2,3,3)
  DF<-as.data.frame(cbind(T1,T2,RESPONSE))
DF$RESPONSE<-as.numeric(DF$RESPONSE)

  DF
   T1 T2 RESPONSE
1   A  E        1
2   A  E        2
3   A  E        3
4   A  E        2
5   A  F        2
6   A  F        1
7   A  F        3
8   A  F        2
9   B  E        7
10  B  E        6
11  B  E        6
12  B  E        7
13  B  F        5
14  B  F        4
15  B  F        4
16  B  F        5
17  C  E        4
18  C  E        4
19  C  E        4
20  C  E        5
21  C  F        1
22  C  F        2
23  C  F        3
24  C  F        3

library(biology)
replications(RESPONSE ~ T1*T2,data=DF)
   T1    T2 T1:T2
    8    12     4
  is.balanced(RESPONSE ~ T1*T2,data=DF)
[1] TRUE


#Now I would like to know whether T1, T2 or T1*T2 have a significant
#effect on RESPONSE. As far as I know, the theory says that I should use
#a type III sum of squares, but the theory also says that if the design
#is completely balanced, there is no difference between type I, II or III
#sums of squares.

#so I first fit a linear model:

my.anov<-lm(RESPONSE~T1+T2+T1:T2)

#then I do a normal Anova

  anova(my.anov)

Analysis of Variance Table

Response: RESPONSE
           Df Sum Sq Mean Sq F value    Pr(>F)
T1         2  103.0  51.500  97.579 2.183e-10 ***
T2         1   24.0  24.000  45.474 2.550e-06 ***
T1:T2      2   12.0   6.000  11.368  0.000642 ***
Residuals 18    9.5   0.528

#When I do the same with the Anova() function from the car package I
#get the same result

Anova(my.anov)

Anova Table (Type II tests)

Response: RESPONSE
           Sum Sq Df F value    Pr(>F)
T1         103.0  2  97.579 2.183e-10 ***
T2          24.0  1  45.474 2.550e-06 ***
T1:T2       12.0  2  11.368  0.000642 ***
Residuals    9.5 18

#(type II seems to be the default and type="I" produces an error (why?))

#yet, when I specify type="III" it gives me something completely different:

Anova(my.anov,type="III")
Anova Table (Type III tests)

Response: RESPONSE
             Sum Sq Df F value    Pr(>F)
(Intercept)   16.0  1  30.316 3.148e-05 ***
T1            84.5  2  80.053 1.100e-09 ***
T2             0.0  1   0.000  1.00
T1:T2         12.0  2  11.368  0.000642 ***
Residuals      9.5 18

#and the AnovaM() function from the biology package does the same for
#type I and II and produces the following result:

library(biology)
  AnovaM(my.anov,type="III")
             Df Sum Sq Mean Sq F value   Pr(>F)
T1           2   84.5  42.250  80.053 1.10e-09 ***
T2           1   24.0  24.000  45.474 2.55e-06 ***
T1:T2        2   12.0   6.000  11.368 0.000642 ***
Residuals   18    9.5   0.528

#Is type III the type I should use, and why do the results differ if the
#design is balanced? I am really confused; it would be great if someone
#could help me out!

#Thanks a lot for your help!

#/Fabian
#University of Gothenburg



















[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Scale time series in a way that 90% of the data is in the -0.9 / +0.9 range

2011-05-12 Thread Mr.Q
Hello,

How can I scale my time series in a way that 90% of the data is in the
-0.9 / +0.9 range?

My approach is to first build a clean vector without the 10% of values
furthest from the mean

require(outliers)

y <- rep(c(1,1,1,1,1,9),10)
yc <- y

ycc <- length(y)*0.1

for(j in 1:ycc)
{
cat("Remove",j)
yc <- rm.outlier(yc)
}

and then do my scaling based on the cleaned data

for(k in 1:length(y)) {
y[k] <- (((y[k]-min(yc))/(max(yc)-min(yc)))*1.8)-0.9
}
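
(As an aside, the element-by-element loop above can be written as a single
vectorized line doing the same rescaling -- a sketch:)

y <- ((y - min(yc)) / (max(yc) - min(yc))) * 1.8 - 0.9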

This works fine for the first three loops, but then strangely crashes :(

Remove 1Remove 2Remove 3Error in if (xor(((max(x, na.rm = TRUE) -
mean(x, na.rm = TRUE))  (mean(x,  :
  missing value where TRUE/FALSE needed
In addition: Warning messages:
1: In max(x, na.rm = TRUE) :
  no non-missing arguments to max; returning -Inf
2: In min(x, na.rm = TRUE) :
  no non-missing arguments to min; returning Inf

Any ideas for me?

Thanks in advance,

Mr. Q

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] package update

2011-05-12 Thread jjallaire
To run RStudio as root on Ubuntu you would just do: sudo rstudio

The packages in /usr/lib/R/library are the ones that came with the base
install of R. 

J.J. Allaire


--
View this message in context: 
http://r.789695.n4.nabble.com/package-update-tp3507479p3517539.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Question about glmnet

2011-05-12 Thread David Katz
I believe you can in this sense: use model.matrix to create X for
glmnet(X,y,...).

However, when dropping variables this will drop the indicators individually,
not per factor, which may not be what you are looking for.
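
For illustration, a minimal sketch along those lines (toy data and names
invented here, not from the original question):

library(glmnet)
set.seed(1)
df  <- data.frame(y  = rnorm(100),
                  x1 = rnorm(100),
                  f  = factor(sample(c("a","b","c"), 100, replace = TRUE)))
X   <- model.matrix(y ~ x1 + f, data = df)[, -1]  # expand f to dummies, drop intercept
fit <- glmnet(X, df$y)                            # dummies are penalized/dropped individually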

Good luck,
David Katz


Axel Urbiz wrote:
 
 Hi,
 
 Is it possible to include factor variables as model inputs using this
 package? I'm quite sure it is not possible, but would like to double
 check.
 
 Thanks,
 
 Axel.
 
   [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 


--
View this message in context: 
http://r.789695.n4.nabble.com/Question-about-glmnet-tp3006439p3517635.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R won't start keeps crashing

2011-05-12 Thread Bazman76
http://r.789695.n4.nabble.com/file/n3517669/R_crash.jpg 

OK when I click on the .RData file I get the screen above. Also when I start
R from the desktop icon or select it from Programs I get the same
result.

The warning message is in focus and I cannot move the focus to the GUI.
When I either click to dismiss the message or click to close it, the whole
session shuts down.

So I cannot enter the commands you suggested into this corrupt version.


However, when I click on the Canada.Rdata the R session starts and seems to
function normally.

Here are the results of the functions you requested. Not sure if they will
help given that this version is working?

 sessionInfo()
R version 2.13.0 (2011-04-13)
Platform: i386-pc-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=English_United Kingdom.1252 
[2] LC_CTYPE=English_United Kingdom.1252   
[3] LC_MONETARY=English_United Kingdom.1252
[4] LC_NUMERIC=C   
[5] LC_TIME=English_United Kingdom.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base   





 load("C:/path/to/file") 
Error in readChar(con, 5L, useBytes = TRUE) : cannot open the connection
In addition: Warning message:
In readChar(con, 5L, useBytes = TRUE) :
  cannot open compressed file 'C:/path/to/file', probable reason 'No such
file or directory'
 
 

Sorry if I gave you the wrong information previously, just let me know what
you want.

--
View this message in context: 
http://r.789695.n4.nabble.com/R-won-t-start-keeps-crashing-tp3516829p3517669.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Errors and line numbers in scripts?

2011-05-12 Thread Elliot Joel Bernstein
Is it possible to get R to report the line number of an error when a script
is called with source()? I found the following post from 2009, but it's not
clear to me if this ever made it into the release version:

ws wrote:
* Is there a way to have R return the line number in a script when it errors 
out?
**
** I call my script like:
**
** $ R --vanilla < script.R > output.txt
**
** I seem to remember a long discussion about this at some point, but I can't
** remember the outcome.
**
**
*
The current development version returns much more information about
error locations.  I don't know if it will handle this case (R doesn't
get told the filename in case of input redirection, for example).
Generally the goal is to report error lines during interactive use,
batch use is assumed to already be debugged.

Duncan Murdoch


Thanks.

- Elliot

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] (no subject)

2011-05-12 Thread Ista Zahn
Hi Fabian,
You may find my discussion of types of SS helpful. My website has
been down for some time, but you can retrieve it from
http://psychology.okstate.edu/faculty/jgrice/psyc5314/SS_types.pdf
among other places.

Best,
Ista

On Thu, May 12, 2011 at 10:33 AM, Fabian fabian_ro...@gmx.de wrote:
 #subject: type III sum of squares - anova() Anova() AnovaM()
 #R-version: 2.12.2

 #Hello everyone,

 #I am currently evaluating experimental data of a  two factor
 experiment. to illustrate de my problem I will use following #dummy
 dataset: Factor T1 has 3 levels (A,B,C) and factor T2 has 2
 levels E and F. The design is #completly balanced, each factor
 combinations has 4 replicates.

 #the dataset looks like this:

 T1-(c(rep(c(A,B,C),each=8)))
 T2-(c(rep(rep(c(E,F),each=4),3)))
 RESPONSE-c(1,2,3,2,2,1,3,2,9,8,8,9,6,5,5,6,5,5,5,6,1,2,3,3)
  DF-as.data.frame(cbind(T1,T2,RESPONSE))
 DF$RESPONSE-as.numeric(DF$RESPONSE)

   DF
    T1 T2 RESPONSE
 1   A  E        1
 2   A  E        2
 3   A  E        3
 4   A  E        2
 5   A  F        2
 6   A  F        1
 7   A  F        3
 8   A  F        2
 9   B  E        7
 10  B  E        6
 11  B  E        6
 12  B  E        7
 13  B  F        5
 14  B  F        4
 15  B  F        4
 16  B  F        5
 17  C  E        4
 18  C  E        4
 19  C  E        4
 20  C  E        5
 21  C  F        1
 22  C  F        2
 23  C  F        3
 24  C  F        3

 library(biology)
 replications(RESPONSE ~ T1*T2,data=DF)
    T1    T2 T1:T2
     8    12     4
  is.balanced(RESPONSE ~ T1*T2,data=DF)
 [1] TRUE


 #Now I would like to know whether T1, T2 or T1*T2 have a significant
 effect on RESPONSE. As far as I know, the #theory says that I should use
 a type III sum of squares, but the theory also says that if the design
 is completely #balanced, there is no difference between type I,II or III
 sum of squares.

 #so I first fit a linear model:

 my.anov-lm(RESPONSE~T1+T2+T1:T2)

 #then I do a normal Anova

   anova(my.anov)

 Analysis of Variance Table

 Response: RESPONSE
           Df Sum Sq Mean Sq F value    Pr(F)
 T1         2  103.0  51.500  97.579 2.183e-10 ***
 T2         1   24.0  24.000  45.474 2.550e-06 ***
 T1:T2      2   12.0   6.000  11.368  0.000642 ***
 Residuals 18    9.5   0.528

 #When I do the same with the Anova() function from the car package I
 get the same result

 Anova(my.anov)

 Anova Table (Type II tests)

 Response: RESPONSE
           Sum Sq Df F value    Pr(F)
 T1         103.0  2  97.579 2.183e-10 ***
 T2          24.0  1  45.474 2.550e-06 ***
 T1:T2       12.0  2  11.368  0.000642 ***
 Residuals    9.5 18

 #(type two sees to be the default and type=I produces an error (why?))

 #yet, when I specify type=III it gives me something completely different:

 Anova(my.anov,type=III)
 Anova Table (Type III tests)

 Response: RESPONSE
             Sum Sq Df F value    Pr(F)
 (Intercept)   16.0  1  30.316 3.148e-05 ***
 T1            84.5  2  80.053 1.100e-09 ***
 T2             0.0  1   0.000  1.00
 T1:T2         12.0  2  11.368  0.000642 ***
 Residuals      9.5 18

 #an the AnovaM() function from the biology package does the same for
 type I and II and produces the following #result:

 library(biology)
  AnovaM(my.anov,type=III)
             Df Sum Sq Mean Sq F value   Pr(F)
 T1           2   84.5  42.250  80.053 1.10e-09 ***
 T2           1   24.0  24.000  45.474 2.55e-06 ***
 T1:T2        2   12.0   6.000  11.368 0.000642 ***
 Residuals   18    9.5   0.528

 #Is type 3 the Type I should use and why do the results differ if the
 design is balanced? I am really confused, it would #be great if someone
 could help me out!

 #Thanks a lot for your help!

 #/Fabian
 #University of Gothenburg



















        [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Errors and line numbers in scripts?

2011-05-12 Thread Duncan Murdoch

On 12/05/2011 11:02 AM, Elliot Joel Bernstein wrote:

Is it possible to get R to report the line number of an error when a script
is called with source()? I found the following post from 2009, but it's not
clear to me if this ever made it into the release version:


It does so for parse errors.  It doesn't record lines of statements at 
the top level of a script for run-time errors, but if you define a 
function in your script, and call it at a different line, and an error 
occurs internally, then R will know which line of the function triggered 
the error.  If you use options(error=recover) you'll get a display 
something like this:


 source("c:/temp/test.R")
Error in f(0) : an error in f

Enter a frame number, or 0 to exit

1: source("c:/temp/test.R")
2: eval.with.vis(ei, envir)
3: eval.with.vis(expr, envir, enclos)
4: g()
5: test.R#6: f(0)

The call to g() was in the test.R script but was not recorded.  However, 
the function g contains a call to f(0), and that one was recorded as 
being at line 6 of the file.


Duncan Murdoch

Here's the test.R file I used:

--
f <- function(x) {
  stop("an error in f")
}

g <- function() {
  f(0)
}

g()


ws wrote:
* Is there a way to have R return the line number in a script when it errors 
out?
**
** I call my script like:
**
** $ R --vanilla < script.R > output.txt
**
** I seem to remember a long discussion about this at some point, but I can't
** remember the outcome.
**
**
*
The current development version returns much more information about
error locations.  I don't know if it will handle this case (R doesn't
get told the filename in case of input redirection, for example).
Generally the goal is to report error lines during interactive use,
batch use is assumed to already be debugged.

Duncan Murdoch


Thanks.

- Elliot

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Simple order() data frame question.

2011-05-12 Thread Matthew Dowle

With data.table, the following is routine :

DT[order(a)]   # ascending
DT[order(-a)]  # descending, if a is numeric
DT[a>5,sum(z),by="c"][order(-V1)]   # sum of z grouped by c, just where a>5, 
then show me the largest first
DT[order(-a,b)]  # order by a descending then by b ascending, if a and b are 
both numeric

It avoids peppering your code with $, and becomes quite natural after a 
short while; especially compound queries such as the 3rd example.

Matthew

http://datatable.r-forge.r-project.org/
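
A minimal setup so the queries above can be tried (toy table added for
illustration, not part of the original post):

library(data.table)
DT <- data.table(a = 1:10, b = rnorm(10),
                 c = rep(c("x","y"), 5), z = runif(10))
DT[order(-a)]
DT[a > 5, sum(z), by = "c"][order(-V1)]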


Ivan Calandra ivan.calan...@uni-hamburg.de wrote in message 
news:4dcbec8b.6040...@uni-hamburg.de...
I was wondering whether it would be possible to make a method for
data.frame with sort().
I think it would be more intuitive than using the complex construction
of df[order(df$a),]
Is there any reason not to make it?

Ivan

Le 5/12/2011 15:40, Marc Schwartz a écrit :
 On May 12, 2011, at 8:09 AM, John Kane wrote:

 Argh.  I knew it was at least partly obvious.  I never have been able to 
 read the order() help page and understand what it is saying.

 THanks very much.

 By the way, to me it is counter-intuitive that the the command is

 df1[order(df1[,2],decreasing=TRUE),]
 For some reason I keep expecting it to be
 order( , df1[,2],decreasing=TRUE)

 So clearly I don't understand what is going on but at least I a lot 
 better off.  I may be able to get this graph to work.

 John,

 Perhaps it may be helpful to understand that order() does not actually 
 sort() the data.

 It returns a vector of indices into the data, where those indices are the 
 sorted ordering of the elements in the vector, or in this case, the 
 column.

 So you want the output of order() to be used within the brackets for the 
 row *indices*, to reflect the ordering of the column (or columns in the 
 case of a multi-level sort) that you wish to use to sort the data frame 
 rows.

 set.seed(1)
 x- sample(10)

 x
   [1]  3  4  5  7  2  8  9  6 10  1


 # sort() actually returns the sorted data
 sort(x)
   [1]  1  2  3  4  5  6  7  8  9 10


 # order() returns the indices of 'x' in sorted order
 order(x)
   [1] 10  5  1  2  3  8  4  6  7  9


 # This does the same thing as sort()
 x[order(x)]
   [1]  1  2  3  4  5  6  7  8  9 10


 set.seed(1)
 df1- data.frame(aa = letters[1:10], bb = rnorm(10))

 df1
 aa bb
 1   a -0.6264538
 2   b  0.1836433
 3   c -0.8356286
 4   d  1.5952808
 5   e  0.3295078
 6   f -0.8204684
 7   g  0.4874291
 8   h  0.7383247
 9   i  0.5757814
 10  j -0.3053884


 # These are the indices of df1$bb in sorted order
 order(df1$bb)
   [1]  3  6  1 10  2  5  7  9  8  4


 # Get df1$bb in increasing order
 df1$bb[order(df1$bb)]
   [1] -0.8356286 -0.8204684 -0.6264538 -0.3053884  0.1836433  0.3295078
   [7]  0.4874291  0.5757814  0.7383247  1.5952808


 # Same thing as above
 sort(df1$bb)
   [1] -0.8356286 -0.8204684 -0.6264538 -0.3053884  0.1836433  0.3295078
   [7]  0.4874291  0.5757814  0.7383247  1.5952808


 You can't use the output of sort() to sort the data frame rows, so you 
 need to use order() to get the ordered indices and then use that to 
 extract the data frame rows in the sort order that you desire:

 df1[order(df1$bb), ]
 aa bb
 3   c -0.8356286
 6   f -0.8204684
 1   a -0.6264538
 10  j -0.3053884
 2   b  0.1836433
 5   e  0.3295078
 7   g  0.4874291
 9   i  0.5757814
 8   h  0.7383247
 4   d  1.5952808


 df1[order(df1$bb, decreasing = TRUE), ]
 aa bb
 4   d  1.5952808
 8   h  0.7383247
 9   i  0.5757814
 7   g  0.4874291
 5   e  0.3295078
 2   b  0.1836433
 10  j -0.3053884
 1   a -0.6264538
 6   f -0.8204684
 3   c -0.8356286


 Does that help?

 Regards,

 Marc Schwartz

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


-- 
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Museum
Abt. Säugetiere
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY
+49(0)40 42838 6231
ivan.calan...@uni-hamburg.de

**
http://www.for771.uni-bonn.de
http://webapp5.rrz.uni-hamburg.de/mammals/eng/1525_8_1.php

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] separate date and time

2011-05-12 Thread KEn
Schatzi adele_thompson at cargill.com writes:

 
 I have a combined date and time. I would like to separate them out into two
 columns so I can do things such as take the mean by time across all dates.
 
 meas-runif(435)
 nTime-seq(1303975800, 1304757000, 1800)
 nDateT-as.POSIXct(nTime, origin=1970-01-01)
 mat1-cbind(nDateT,meas)
 
 means1- aggregate(mat1$meas, list(nDateT), mean)
 
 This doesn't do anything as each day is different, but if I had just the
 time, it would take the mean outputing 48 values (for each 30 min).
 
 Also, sometimes there are missing meas to a specific time. Is there anyway
 to copy the previous meas if one is missing?
 
 -
 In theory, practice and theory are the same. In practice, they are not -
Albert Einstein
 --
 View this message in context:
http://r.789695.n4.nabble.com/separate-date-and-time-tp3517571p3517571.html
 Sent from the R help mailing list archive at Nabble.com.
 
 

Not sure if this is what you want, but you can use substr to split nDateT into
date and time, and then use aggregate() on the time column of df1.

meas <- runif(435)
nTime <- seq(1303975800, 1304757000, 1800)
nDateT <- as.POSIXct(nTime, origin="1970-01-01")
date <- substr(nDateT, 1, 10)
time <- substr(nDateT, 12, 19)
df1 <- data.frame(date, time, meas)

means1 <- aggregate(df1$meas, list(df1$time), mean)

HTH,
Ken
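
A small addition (a sketch, not part of the reply above): format() can pull the
time-of-day straight out of the POSIXct values:

time <- format(nDateT, "%H:%M:%S")
means1 <- aggregate(df1$meas, list(time), mean)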

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Simple order() data frame question.

2011-05-12 Thread John Kane
Thanks Matthew,
I had data.table installed but totally forgot about it.

I've only used it once or twice and, IIRC, that was last year. I remember 
thinking at the time that it was a very handy package, but lack of need for this 
sort of thing let me forget it.



--- On Thu, 5/12/11, Matthew Dowle mdo...@mdowle.plus.com wrote:

 From: Matthew Dowle mdo...@mdowle.plus.com
 Subject: Re: [R] Simple order()  data frame question.
 To: r-h...@stat.math.ethz.ch
 Received: Thursday, May 12, 2011, 11:23 AM
 
 With data.table, the following is routine :
 
 DT[order(a)]   # ascending
 DT[order(-a)]  # descending, if a is numeric
 DT[a5,sum(z),by=c][order(-V1)]   # sum
 of z group by c, just where a5, 
 then show me the largest first
 DT[order(-a,b)]  # order by a descending then by b
 ascending, if a and b are 
 both numeric
 
 It avoids peppering your code with $, and becomes quite
 natural after a 
 short while; especially compound queries such as the 3rd
 example.
 
 Matthew
 
 http://datatable.r-forge.r-project.org/
 
 
 Ivan Calandra ivan.calan...@uni-hamburg.de
 wrote in message 
 news:4dcbec8b.6040...@uni-hamburg.de...
 I was wondering whether it would be possible to make a
 method for
 data.frame with sort().
 I think it would be more intuitive than using the complex
 construction
 of df[order(df$a),]
 Is there any reason not to make it?
 
 Ivan
 
 Le 5/12/2011 15:40, Marc Schwartz a écrit :
  On May 12, 2011, at 8:09 AM, John Kane wrote:
 
  Argh.  I knew it was at least partly
 obvious.  I never have been able to 
  read the order() help page and understand what it
 is saying.
 
  THanks very much.
 
  By the way, to me it is counter-intuitive that the
 the command is
 
  df1[order(df1[,2],decreasing=TRUE),]
  For some reason I keep expecting it to be
  order( , df1[,2],decreasing=TRUE)
 
  So clearly I don't understand what is going on but
 at least I a lot 
  better off.  I may be able to get this graph
 to work.
 
  John,
 
  Perhaps it may be helpful to understand that order()
 does not actually 
  sort() the data.
 
  It returns a vector of indices into the data, where
 those indices are the 
  sorted ordering of the elements in the vector, or in
 this case, the 
  column.
 
  So you want the output of order() to be used within
 the brackets for the 
  row *indices*, to reflect the ordering of the column
 (or columns in the 
  case of a multi-level sort) that you wish to use to
 sort the data frame 
  rows.
 
  set.seed(1)
  x- sample(10)
 
  x
    [1]  3  4  5 
 7  2  8  9  6 10  1
 
 
  # sort() actually returns the sorted data
  sort(x)
    [1]  1  2  3 
 4  5  6  7  8  9 10
 
 
  # order() returns the indices of 'x' in sorted order
  order(x)
    [1] 10  5  1  2 
 3  8  4  6  7  9
 
 
  # This does the same thing as sort()
  x[order(x)]
    [1]  1  2  3 
 4  5  6  7  8  9 10
 
 
  set.seed(1)
  df1- data.frame(aa = letters[1:10], bb =
 rnorm(10))
 
  df1
      aa     
    bb
  1   a -0.6264538
  2   b  0.1836433
  3   c -0.8356286
  4   d  1.5952808
  5   e  0.3295078
  6   f -0.8204684
  7   g  0.4874291
  8   h  0.7383247
  9   i  0.5757814
  10  j -0.3053884
 
 
  # These are the indices of df1$bb in sorted order
  order(df1$bb)
    [1]  3  6  1 10 
 2  5  7  9  8  4
 
 
  # Get df1$bb in increasing order
  df1$bb[order(df1$bb)]
    [1] -0.8356286 -0.8204684 -0.6264538
 -0.3053884  0.1836433  0.3295078
    [7]  0.4874291 
 0.5757814  0.7383247  1.5952808
 
 
  # Same thing as above
  sort(df1$bb)
    [1] -0.8356286 -0.8204684 -0.6264538
 -0.3053884  0.1836433  0.3295078
    [7]  0.4874291 
 0.5757814  0.7383247  1.5952808
 
 
  You can't use the output of sort() to sort the data
 frame rows, so you 
  need to use order() to get the ordered indices and
 then use that to 
  extract the data frame rows in the sort order that you
 desire:
 
  df1[order(df1$bb), ]
      aa     
    bb
  3   c -0.8356286
  6   f -0.8204684
  1   a -0.6264538
  10  j -0.3053884
  2   b  0.1836433
  5   e  0.3295078
  7   g  0.4874291
  9   i  0.5757814
  8   h  0.7383247
  4   d  1.5952808
 
 
  df1[order(df1$bb, decreasing = TRUE), ]
      aa     
    bb
  4   d  1.5952808
  8   h  0.7383247
  9   i  0.5757814
  7   g  0.4874291
  5   e  0.3295078
  2   b  0.1836433
  10  j -0.3053884
  1   a -0.6264538
  6   f -0.8204684
  3   c -0.8356286
 
 
  Does that help?
 
  Regards,
 
  Marc Schwartz
 
  __
  R-help@r-project.org
 mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide 
  http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained,
 reproducible code.
 
 
 -- 
 Ivan CALANDRA
 PhD Student
 University of Hamburg
 Biozentrum Grindel und Zoologisches Museum
 Abt. Säugetiere
 Martin-Luther-King-Platz 3
 D-20146 Hamburg, GERMANY
 +49(0)40 42838 6231
 ivan.calan...@uni-hamburg.de
 
 **
 http://www.for771.uni-bonn.de
 http://webapp5.rrz.uni-hamburg.de/mammals/eng/1525_8_1.php
 
 

Re: [R] R won't start keeps crashing

2011-05-12 Thread Jonathan Daily
I have some suggestions inline below. My biggest suggestion would be
to read the help files that came with R, especially the section
"Invoking R" in "An Introduction to R".

On Thu, May 12, 2011 at 10:24 AM, Bazman76 h_a_patie...@hotmail.com wrote:
 http://r.789695.n4.nabble.com/file/n3517669/R_crash.jpg

 OK when I click on the .RData file I get the screen above. Also when I start
 R from the desktop icon or from select if from programs I get the same
 result.

This may mean that you answered yes to the exit prompt asking you to
save your workspace, especially if you didn't deliberately save a file
called .RData yourself.


 The warning message is in focus and I can not move the focus to the GUI.
 When I either click to dismiss the message of click to close it the whole
 session shuts down.

 So I can not enter the commands you sugested into this corrupt version.


 However, when I click on the Canada.Rdata the R session starts and seems to
 function normally.

 Here are the results of the functions you requested. Not sure if they will
 help given that this version is workfing?

 sessionInfo()
 R version 2.13.0 (2011-04-13)
 Platform: i386-pc-mingw32/i386 (32-bit)

 locale:
 [1] LC_COLLATE=English_United Kingdom.1252
 [2] LC_CTYPE=English_United Kingdom.1252
 [3] LC_MONETARY=English_United Kingdom.1252
 [4] LC_NUMERIC=C
 [5] LC_TIME=English_United Kingdom.1252

 attached base packages:
 [1] stats     graphics  grDevices utils     datasets  methods   base





 load(C:/path/to/file)
 Error in readChar(con, 5L, useBytes = TRUE) : cannot open the connection
 In addition: Warning message:
 In readChar(con, 5L, useBytes = TRUE) :
  cannot open compressed file 'C:/path/to/file', probable reason 'No such
 file or directory'

Replace path/to/file with the path to your file.




 Sorry if I gave you the wrong information previously, just let me know what
 you want.

 --
 View this message in context: 
 http://r.789695.n4.nabble.com/R-won-t-start-keeps-crashing-tp3516829p3517669.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
===
Jon Daily
Technician
===
#!/usr/bin/env outside
# It's great, trust me.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R won't start keeps crashing

2011-05-12 Thread David Winsemius


On May 12, 2011, at 10:24 AM, Bazman76 wrote:


http://r.789695.n4.nabble.com/file/n3517669/R_crash.jpg

OK when I click on the .RData file I get the screen above. Also when  
I start

R from the desktop icon or from select if from programs I get the same
result.


Why do you even have this file around anymore? You have been told that  
it is corrupt and that trashing it is the way to go forward.


(I suppose you could follow up on the error message regarding the missing `vars`
package and see if that restores civil order, but if .Rdata doesn't
have useful information that would be difficult to rebuild, then
just trash it.)


--
David.



The warning message is in focus and I can not move the focus to the  
GUI.
When I either click to dismiss the message of click to close it the  
whole

session shuts down.

So I can not enter the commands you sugested into this corrupt  
version.



However, when I click on the Canada.Rdata the R session starts and  
seems to

function normally.

Here are the results of the functions you requested. Not sure if  
they will

help given that this version is workfing?


sessionInfo()

R version 2.13.0 (2011-04-13)
Platform: i386-pc-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=English_United Kingdom.1252
[2] LC_CTYPE=English_United Kingdom.1252
[3] LC_MONETARY=English_United Kingdom.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base



load(C:/path/to/file)
Error in readChar(con, 5L, useBytes = TRUE) : cannot open the  
connection

In addition: Warning message:
In readChar(con, 5L, useBytes = TRUE) :
 cannot open compressed file 'C:/path/to/file', probable reason 'No  
such

file or directory'





Sorry if I gave you the wrong information previously, just let me  
know what

you want.





David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] separate date and time

2011-05-12 Thread Schatzi
That is wonderful. Thank you.
Adele


Ken Takagi wrote:
 
 Schatzi adele_thompson at cargill.com writes:
 
 
 I have a combined date and time. I would like to separate them out into
 two
 columns so I can do things such as take the mean by time across all
 dates.
 
 meas-runif(435)
 nTime-seq(1303975800, 1304757000, 1800)
 nDateT-as.POSIXct(nTime, origin=1970-01-01)
 mat1-cbind(nDateT,meas)
 
 means1- aggregate(mat1$meas, list(nDateT), mean)
 
 This doesn't do anything as each day is different, but if I had just the
 time, it would take the mean outputing 48 values (for each 30 min).
 
 Also, sometimes there are missing meas to a specific time. Is there
 anyway
 to copy the previous meas if one is missing?
 
 -
 In theory, practice and theory are the same. In practice, they are not -
 Albert Einstein
 --
 View this message in context:
 http://r.789695.n4.nabble.com/separate-date-and-time-tp3517571p3517571.html
 Sent from the R help mailing list archive at Nabble.com.
 
 
 
 Not sure if this is what you want, but you can use substr to split nDateT
 into
 date and time, and then use aggregate() in the time column in df1.
 
 meas-runif(435)
 nTime-seq(1303975800, 1304757000, 1800)
 nDateT-as.POSIXct(nTime, origin=1970-01-01)
 date - substr(nDateT, 1, 10)
 time - substr(nDateT, 12, 19)
 df1 - data.frame(date, time, meas)
 
 means1- aggregate(df1$meas, list(df1$time), mean)
 
 HTH,
 Ken
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 


-
In theory, practice and theory are the same. In practice, they are not - Albert 
Einstein
--
View this message in context: 
http://r.789695.n4.nabble.com/separate-date-and-time-tp3517571p3517999.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] problem converting character to dates

2011-05-12 Thread Assu
Hi all,

I've searched this problem and still I can't understand my results, so here
goes:

I have some time series I've imported from excel with the dates in text
format from excel. Data was imported with RODBC, sqlQuery() function.

I have these dates: 
adates
 [1] 01/2008 02/2008 03/2008 04/2008 05/2008 06/2008 07/2008
  [8] 08/2008 09/2008 10/2008 11/2008 12/2008 13/2008 14/2008
I want the format week/year, so I do:
as.Date(adates,format=c("%W/%y"))
and get 
 [1] 2020-05-12 2020-05-12 2020-05-12 2020-05-12 2020-05-12
  [6] 2020-05-12 2020-05-12 2020-05-12 2020-05-12 2020-05-12
everything is equal to this: 2020-this month-today

if I use strptime(adates, "%W/%y")
it's the same.

Can you explain why this happens and how to solve it?
Thanks in advance
Assu

--
View this message in context: 
http://r.789695.n4.nabble.com/problem-converting-character-to-dates-tp3517918p3517918.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] log transformation and mean question

2011-05-12 Thread 1Rnwb
I have a question about log2 transformation and performing the mean on log2 data. I
am doing analysis of ELISA data. The OD values and the concentration
values for the standards were log2 transformed before performing the lm. The
OD values for samples were log2 transformed and the coefficients of the lm were
applied to get the log2 concentration values. I then backtransformed these
log2 concentrations and the trouble started: when I take the mean of the log2
concentrations the value is different from the backtransformed
concentrations.

 100+1000/2
[1] 600

 2^((log2(100)+log2(1000))/2)
[1] 316.2278

What am I doing wrong to get the different values?

--
View this message in context: 
http://r.789695.n4.nabble.com/log-transformation-and-mean-question-tp3517825p3517825.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] problem with mediation

2011-05-12 Thread Mervi Virtanen




Hello!

I have a problem with mediation analysis. I can do it with the function
mediate when I have one mediator. But how can I do it if I have one
independent variable and one dependent variable but 4 mediators? I
have tried the function mediations, but it doesn't work. If I use mediate 4
times, once for each mediator, is that the same? I want to know the
total mediation effect for the 4 mediators.


t.Mete

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Change font size in Windows

2011-05-12 Thread John Kane
My day for dumb questions. How do I increase the type size in the Rgui console 
in Windows? (R-2.13.0, Windows 7)

It looked to me that I just needed to change the font spec in Rconsole but that 
does not seem to be working.

The R FAQ for Windows has a reference in Q3.4 to changing fonts, (Q5.2), but I 
don't see anything relevant there. 

Rconsole originally was:
font = TT Courier New
points = 10
style = normal # Style can be normal, bold, italic

I changed this to points=14 and then in desperation to points=18 with no effect.

I have both restarted R and done a complete reboot to no avail.  What am I 
missing?

Any advice would be most gratefully received
===
 sessionInfo()
R version 2.13.0 (2011-04-13)
Platform: i386-pc-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=English_Canada.1252  LC_CTYPE=English_Canada.1252
LC_MONETARY=English_Canada.1252
[4] LC_NUMERIC=CLC_TIME=English_Canada.1252

attached base packages:
[1] grDevices datasets  splines   graphics  stats tcltk utils 
methods   base 

other attached packages:
[1] svSocket_0.9-51 TinnR_1.0.3 R2HTML_2.2  Hmisc_3.8-3 
survival_2.36-9

loaded via a namespace (and not attached):
[1] cluster_1.13.3  grid_2.13.0 lattice_0.19-26 svMisc_0.9-61   tools_2.13.0

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] lm and anova

2011-05-12 Thread peter dalgaard

On May 12, 2011, at 15:30 , Paul Chatfield wrote:

 anova uses sequential sums of squares (type 1),

Yes.

 summary adjusted sums of
 squares (type 3)

No. Type III SS is a considerably stranger beast. 

summary() looks at the s.e. of individual coefficients. For 1 DF effects, this 
is often equivalent to Type II tests (not III, except when they happen to be 
equal), except when looking at main effect terms in the presence of 
interactions (in which case things get parametrization-dependent.)
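
To see the sequential behaviour concretely, here is a small sketch with made-up 
data (not from the original posts); reordering the terms changes the anova() 
table but leaves the summary() t-tests untouched:

set.seed(1)
d <- data.frame(x1 = rnorm(30), x2 = rnorm(30))
d$y <- d$x1 + 0.5 * d$x2 + rnorm(30)

## anova() tests each term after the ones listed before it (Type I),
## so the order of the terms matters
anova(lm(y ~ x1 + x2, data = d))
anova(lm(y ~ x2 + x1, data = d))

## summary() tests each coefficient adjusted for all the others,
## so its t-tests do not depend on the order
summary(lm(y ~ x1 + x2, data = d))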

-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Saving misclassified records into dataframe within a loop

2011-05-12 Thread John Dennison
Greetings R world,

I know some version of the this question has been asked before, but i need
to save the output of a loop into a data frame to eventually be written to a
postgres data base with dbWriteTable. Some background. I have developed
classification models to help identify problem accounts. The logic is this:
if the model classifies the record as including variable X and it turns out
that record does not have X, then it should be reviewed (i.e. I need the row
number/ID saved to a database). Generally I want to look at the
misclassified records. This is a little hack, I know; if anyone has a better
idea please let me know. Here is an example

library(rpart)

# grow tree
fit <- rpart(Kyphosis ~ Age + Number + Start,
   method = "class", data = kyphosis)
#predict
prediction <- predict(fit, kyphosis)

#misclassification index function

predict.function <- function(x){
for (i in 1:length(kyphosis$Kyphosis)) {
#the idea is that if the record is absent but the prediction is otherwise
#then show me that record
 if (((kyphosis$Kyphosis[i]=="absent")==(prediction[i,1]==1)) == 0 ){
   #THIS WORKS
print( row.names(kyphosis[c(i),]))
}
} }

predict.function(x)

Now my issue is that I want to save these ids to a data.frame so I can later
save them to a database. Is this an incorrect approach? Can I save each id
to the postgres instance as it is found? I have an ignorant fear of lapply,
but it seems it may hold the key.


Ive tried

predict.function <- function(x){
results <- as.data.frame(1)
for (i in 1:length(kyphosis$Kyphosis)) {
#the idea is that if the record is absent but the prediction is otherwise
#then show me that record
 if (((kyphosis$Kyphosis[i]=="absent")==(prediction[i,1]==1)) == 0 ){
   #THIS WORKS
results[i,] <- as.data.frame(row.names(kyphosis[c(i),]))
}
} }

this does not work. results object does not get saved. Any Help would be
greatly appreciated.


Thanks

John Dennison

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Simple 95% confidence interval for a median

2011-05-12 Thread Greg Snow
Contrary to the commonly held assumption, the Wilcoxon test does not deal with 
medians in general.

There are some specific cases/assumptions where the test/interval would apply 
to the median, if I remember correctly the assumptions include that the 
population distribution is symmetric and the only alternatives considered are 
shifts of the distribution (both assumptions that go contrary to what I would 
believe in most situations where I would want to use the Wilcoxon test).

If you want an actual confidence interval on the true median, then you either 
need to make some assumptions about the distribution that the data comes from, 
or use a tool like the bootstrap.
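
For example, a rough sketch of a bootstrap percentile interval (assuming the data 
are in a vector called Comps, as in the original post):

set.seed(1)
boot.meds <- replicate(10000, median(sample(Comps, replace = TRUE)))
quantile(boot.meds, c(0.025, 0.975))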

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of Georgina Imberger
 Sent: Thursday, May 12, 2011 7:36 AM
 To: r-help@r-project.org
 Subject: [R] Simple 95% confidence interval for a median
 
 Hi!
 
 I have a data set of 86 values that are non-normally distributed
 (counts).
 
 The median value is 10. I want to get an estimate of the  95%
 confidence
 interval for this median value.
 
 I tried to use a one-sample Wilcoxon test:
 
 wilcox.test(Comps,mu=10,conf.int=TRUE)
 
 and got the following output:
 
  Wilcoxon signed rank test with continuity correction
 
 data:  Comps
 V = 2111, p-value = 0.05846
 alternative hypothesis: true location is not equal to 10
 95 percent confidence interval:
  10.0 17.49993
 sample estimates:
 (pseudo)median
   12.50006
 
 I wonder if someone would mind helping me out?
 
 What am I doing wrong?
 What is the '(pseudo)median'?
 Can I get R to estimate the confidence around the actual median of 10?
 
 With thanks,
 Georgie
 
   [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] log transformation and mean question

2011-05-12 Thread Ted Harding
On 12-May-11 15:15:00, 1Rnwb wrote:
 I have question about log2 transformation and performing mean
 on log2 data. I am doing analysis for ELISA data. the OD values
 and the concentration values for the standards were log2
 transformed before performing the lm. the OD values for samples
 were log2 transformed and coefficients of lm were applied to get
 the log2 concentration values. I then backtransformed these
 log2 concentrations and the trouble started. when i take the
 mean of log2 concentrations the value is different than the
 backtransformed concentrations.
 
 100+1000/2
 [1] 600
 
 2^( ( log2(100)+log2(1000) )/2 )
 [1] 316.2278
 
 What I am doing wrong to get the different values

Apart from the fact that I think your first line should be

  (100+1000)/2
  # [1] 550

you are doing nothing whatever wrong! The difference is an
inevitable result of the fact that, for any set of positive
numbers X = c(x1,x2,...,xn), not all equal,

  mean(log(X)) < log(mean(X))

This is because the curve of y = log(x) lies below the
tangent to the curve at any given point. If that point is
mean(X), and the tangent is y = a + b*x, then

  mean(log(X)) < mean(a + b*X) = a + b*mean(X) = log(mean(X))

since y = a + b*x is tangent to y = log(x) at x = mean(X).
This is a special case of a general result called Jensen's
Inequality.

Your second line is

  2^mean(log2(X)) < 2^log2(mean(X)) = mean(X).

where X = c(100,1000).
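
A quick numerical check with the two values from the question:

  X <- c(100, 1000)
  mean(log2(X))     # about 8.30, smaller than ...
  log2(mean(X))     # about 9.10, i.e. log2(550)
  2^mean(log2(X))   # 316.2278, the geometric mean of X
  mean(X)           # 550, the arithmetic mean of X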

Ted.


E-Mail: (Ted Harding) ted.hard...@wlandres.net
Fax-to-email: +44 (0)870 094 0861
Date: 12-May-11   Time: 17:37:45
-- XFMail --

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] group length

2011-05-12 Thread Asan Ramzan
Hi
 
I have four groups
 
y1=c(1.214,1.180,1.199)
y2=c(1.614,1.710,1.867,1.479)
y3=c(1.361,1.270,1.375,1.299)
y4=c(1.459,1.335)
Is there a function that can give me the length for each, like the made up 
example below?
 
function(length(y1:y2)
[1] 3 4 4 2
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] group length

2011-05-12 Thread Scott Chamberlain
require(plyr)
laply(list(y1, y2, y3, y4), length)

Scott 
On Thursday, May 12, 2011 at 11:50 AM, Asan Ramzan wrote:
Hi
 
 I have four groups
 
 y1=c(1.214,1.180,1.199)
 y2=c(1.614,1.710,1.867,1.479)
 y3=c(1.361,1.270,1.375,1.299)
 y4=c(1.459,1.335)
 Is there a function that can give me the length for each, like the made up 
 example below?
 
  function(length(y1:y2)
 [1] 3 4 4 2
  [[alternative HTML version deleted]]
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] group length

2011-05-12 Thread Ledon, Alain
sapply...

 y1=c(1.214,1.180,1.199)
 y2=c(1.614,1.710,1.867,1.479)
 y3=c(1.361,1.270,1.375,1.299)
 y4=c(1.459,1.335)
 sapply(list(y1,y2,y3,y4), length)
[1] 3 4 4 2


-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Asan Ramzan
Sent: Thursday, May 12, 2011 12:50 PM
To: r-help@r-project.org
Subject: [R] group length

Hi
 
I have four groups
 
y1=c(1.214,1.180,1.199)
y2=c(1.614,1.710,1.867,1.479)
y3=c(1.361,1.270,1.375,1.299)
y4=c(1.459,1.335)
Is there a function that can give me the length for each, like the made up 
example below?
 
function(length(y1:y2)
[1] 3 4 4 2
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R won't start keeps crashing

2011-05-12 Thread Bazman76
I will read the Intro to R.

When closing I got a pop-up asking me if I wanted to save the workspace; I
just clicked yes.

here is what I got

load("C:/Documents and Settings/Hugh/My Documents/vars/vars/data")
Error in readChar(con, 5L, useBytes = TRUE) : cannot open the connection
In addition: Warning message:
In readChar(con, 5L, useBytes = TRUE) :
  cannot open compressed file 'C:/Documents and Settings/Hugh/My
Documents/vars/vars/data', probable reason 'Permission denied'


--
View this message in context: 
http://r.789695.n4.nabble.com/R-won-t-start-keeps-crashing-tp3516829p3518035.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R won't start keeps crashing

2011-05-12 Thread Bazman76
will delete it, just wanted to try and sort out the bug

--
View this message in context: 
http://r.789695.n4.nabble.com/R-won-t-start-keeps-crashing-tp3516829p3518036.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Survival Rate Estimates

2011-05-12 Thread Brian McLoone
Dear List,

Is there an automated way to use the survival package to generate survival
rate estimates and their standard errors?  To be clear, *not *the
survivorship estimates (which are cumulative), but the survival *rate *
estimates...

Thank you in advance for any help.

Best,
Brian

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Change font size in Windows

2011-05-12 Thread Greg Snow
One simple way:

Run R (the gui version)
Click on the Edit menu
Click on the GUI Preferences item.

Select the font, size, style, colors, etc. that you want. If you click on Save 
then these become the new default.  If you click on Apply, but don't save then 
they will last that session but not be the new defaults.

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of John Kane
 Sent: Thursday, May 12, 2011 10:25 AM
 To: R R-help
 Subject: [R] Change font size in Windows
 
 My day for dumb questions. How do I increase the type size in the Rgui
 console in Windows? (R-2.13.0, Windows 7)
 
 It looked to me that I just needed to change the font spec in Rconsole
 but that does not seem to be working.
 
 The R FAQ for Windows has a reference in Q3.4 to changing fonts,
 (Q5.2), but I don't see anything relevant there.
 
 Rconsole originally was:
 font = TT Courier New
 points = 10
 style = normal # Style can be normal, bold, italic
 
 I changed this to points=14 and then in desperation to points=18 with
 no effect.
 
 I have both restarted R and done a complete reboot to no avail.  What
 am I missing?
 
 Any advice would be most gratefully received
 ===
  sessionInfo()
 R version 2.13.0 (2011-04-13)
 Platform: i386-pc-mingw32/i386 (32-bit)
 
 locale:
 [1] LC_COLLATE=English_Canada.1252  LC_CTYPE=English_Canada.1252
 LC_MONETARY=English_Canada.1252
 [4] LC_NUMERIC=CLC_TIME=English_Canada.1252
 
 attached base packages:
 [1] grDevices datasets  splines   graphics  stats tcltk utils
 methods   base
 
 other attached packages:
 [1] svSocket_0.9-51 TinnR_1.0.3 R2HTML_2.2  Hmisc_3.8-3
 survival_2.36-9
 
 loaded via a namespace (and not attached):
 [1] cluster_1.13.3  grid_2.13.0 lattice_0.19-26 svMisc_0.9-61
 tools_2.13.0
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Change font size in Windows

2011-05-12 Thread Uwe Ligges



On 12.05.2011 18:25, John Kane wrote:

My day for dumb questions. How do I increase the type size in the Rgui console 
in Windows? (R-2.13.0, Windows 7)

It looked to me that I just needed to change the font spec in Rconsole but that 
does not seem to be working.

The R FAQ for Windows has a reference in Q3.4 to changing fonts, (Q5.2), but I 
don't see anything relevant there.

Rconsole originally was:
font = TT Courier New
points = 10
style = normal # Style can be normal, bold, italic

I changed this to points=14 and then in desperation to points=18 with no effect.

I have both restarted R and done a complete reboot to no avail.  What am I 
missing?


You probably edited *one* of at least two Rconsole files. The one in 
your home directory (probably C:/Users/Username/Documents/Rconsole) 
takes precedence.


Uwe Ligges




Any advice would be most gratefully received
===

sessionInfo()

R version 2.13.0 (2011-04-13)
Platform: i386-pc-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=English_Canada.1252  LC_CTYPE=English_Canada.1252
LC_MONETARY=English_Canada.1252
[4] LC_NUMERIC=CLC_TIME=English_Canada.1252

attached base packages:
[1] grDevices datasets  splines   graphics  stats tcltk utils 
methods   base

other attached packages:
[1] svSocket_0.9-51 TinnR_1.0.3 R2HTML_2.2  Hmisc_3.8-3 
survival_2.36-9

loaded via a namespace (and not attached):
[1] cluster_1.13.3  grid_2.13.0 lattice_0.19-26 svMisc_0.9-61   tools_2.13.0

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] package update

2011-05-12 Thread Dirk Eddelbuettel

On 9 May 2011 at 12:57, Uwe Ligges wrote:
| 
| 
| On 08.05.2011 19:54, eric wrote:
|  I tried to update my packages using update.packages()
| 
|  I got the following message:
| 
|  The downloaded packages are in
|  ‘/tmp/RtmpyDYdTX/downloaded_packages’
|  Warning in install.packages(update[instlib == l, Package], l, contriburl =
|  contriburl,  :
| 'lib = /usr/lib/R/library' is not writable
|  Error in install.packages(update[instlib == l, Package], l, contriburl =
|  contriburl,  :
| unable to install package
| 
|  How do I fix this ?
| 
| If you want to update packages in R's default library in 
| /usr/lib/R/library, you will need root permissions.

The way it is meant to work is that you (eric, the user) become a member of
the group owning that directory -- and I picked group 'staff' for that. 

In the postinst (of the Debian / Ubuntu R packages), this directory is created 
as

# edd 03 Apr 2003  cf Section 10.1.2 of Debian Policy
if [ ! -e /usr/local/lib/R ]; then
  if mkdir /usr/local/lib/R 2>/dev/null; then
chown root:staff /usr/local/lib/R
chmod 2775 /usr/local/lib/R
  fi
fi
if [ ! -e /usr/local/lib/R/site-library ]; then
  if mkdir /usr/local/lib/R/site-library 2>/dev/null; then
chown root:staff /usr/local/lib/R/site-library
chmod 2775 /usr/local/lib/R/site-library
  fi
fi

We could conceivably be fancier and create an R group on the system, but I
felt this is best left to local admins. 

Alternatively, if you make that directory owned by 'you' then you don't need
root either.

You can check what group you are part of via 'id'. On my Ubuntu box, I am
member of a few groups:

edd@max:~$ id
uid=1000(edd) gid=1000(edd) 
groups=1000(edd),4(adm),20(dialout),24(cdrom),27(sudo),44(video),46(plugdev),50(staff),107(lpadmin),115(admin),122(sambashare),124(libvirtd)
edd@max:~$ 

Hope this helps,  Dirk 

 
| Uwe Ligges
| 
| 
| 
| 
|  --
|  View this message in context: 
http://r.789695.n4.nabble.com/package-update-tp3507479p3507479.html
|  Sent from the R help mailing list archive at Nabble.com.
| 
|  __
|  R-help@r-project.org mailing list
|  https://stat.ethz.ch/mailman/listinfo/r-help
|  PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
|  and provide commented, minimal, self-contained, reproducible code.
| 
| __
| R-help@r-project.org mailing list
| https://stat.ethz.ch/mailman/listinfo/r-help
| PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
| and provide commented, minimal, self-contained, reproducible code.

-- 
Gauss once played himself in a zero-sum game and won $50.
  -- #11 at http://www.gaussfacts.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] problem converting character to dates

2011-05-12 Thread Uwe Ligges



On 12.05.2011 17:40, Assu wrote:

Hi all,

I've searched this problem and still I can't understand my results, so here
goes:

I have some time series I've imported from excel with the dates in text
format from excel. Data was imported with RODBC, sqlQuery() function.

I have these dates:

adates

  [1] 01/2008 02/2008 03/2008 04/2008 05/2008 06/2008 07/2008
   [8] 08/2008 09/2008 10/2008 11/2008 12/2008 13/2008 14/2008
I want the format week/year, so I do:

as.Date(adates, format = c("%W/%y"))

and get
  [1] 2020-05-12 2020-05-12 2020-05-12 2020-05-12 2020-05-12
   [6] 2020-05-12 2020-05-12 2020-05-12 2020-05-12 2020-05-12
everything is equal to this: 2020-this month-today

if I use strptime(adates, "%W/%y")



1. you need an upper case Y
2. you need a weekday (otherwise the result is not well defined and you 
get today as the default)


hence

strptime(paste("0", adates, sep = "/"), "%w/%W/%Y")

should work.
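
For example (a sketch using the week numbers from the question):

adates <- c("01/2008", "02/2008", "14/2008")
## prepend a weekday (0 = Sunday) so that week-of-year/year is well defined
strptime(paste("0", adates, sep = "/"), format = "%w/%W/%Y")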

Uwe Ligges





it's the same.

Can you explain why this happens and how to solve it?
Thanks in advance
Assu

--
View this message in context: 
http://r.789695.n4.nabble.com/problem-converting-character-to-dates-tp3517918p3517918.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] confint.multinom() slow?

2011-05-12 Thread Hans Ekbrand
Dear R-helpers,

I'm doing a bivariate analysis with two factors, both with relatively
many levels:

1. clustering, a factor with 35 levels
2. country, a factor with 24 levels

n = 12,855

my.fit <- multinom(clustering ~ country, maxit=300)
converges after 280 iterations.

I would like to get CI:s for the odds ratios, and have tried confint()

my.cis <- confint(my.fit)

I started confint() a few hours ago, but now I'm getting suspicious,
since it hasn't terminated yet. Perhaps I just lack reasonable
patience, but is such a long computation time for confint() to be
expected here?

Hans Ekbrand



signature.asc
Description: OpenPGP digital signature
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] group length

2011-05-12 Thread Jeffrey Dick
On Thu, May 12, 2011 at 10:00 AM, Ledon, Alain alain.le...@ally.com wrote:
 sapply...

 y1=c(1.214,1.180,1.199)
 y2=c(1.614,1.710,1.867,1.479)
 y3=c(1.361,1.270,1.375,1.299)
 y4=c(1.459,1.335)
 sapply(list(y1,y2,y3,y4), length)
 [1] 3 4 4 2


Or, if you don't want to name each object individually:

 sapply(mget(paste("y", 1:4, sep=""), sys.frame()), length)
y1 y2 y3 y4
 3  4  4  2

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Change font size in Windows

2011-05-12 Thread John Kane
Definitely.  I edited the one in the program files.  

I think I saw a ref to the home file but it did not sink in.

Thanks

--- On Thu, 5/12/11, Uwe Ligges lig...@statistik.tu-dortmund.de wrote:

 From: Uwe Ligges lig...@statistik.tu-dortmund.de
 Subject: Re: [R] Change font size in Windows
 To: John Kane jrkrid...@yahoo.ca
 Cc: R R-help r-h...@stat.math.ethz.ch
 Received: Thursday, May 12, 2011, 1:09 PM
 
 
 On 12.05.2011 18:25, John Kane wrote:
  My day for dumb questions. How do I increase the type
 size in the Rgui console in Windows? (R-2.13.0, Windows 7)
 
  It looked to me that I just needed to change the font
 spec in Rconsole but that does not seem to be working.
 
  The R FAQ for Windows has a reference in Q3.4 to
 changing fonts, (Q5.2), but I don't see anything relevant
 there.
 
  Rconsole originally was:
  font = TT Courier New
  points = 10
  style = normal # Style can be normal, bold, italic
 
  I changed this to points=14 and then in desperation to
 points=18 with no effect.
 
  I have both restarted R and done a complete reboot to
 no avail.  What am I missing?
 
 You probably edited *one* of at least two Rconsole files.
 The one in 
 your home directory (probably
 C:/Users/Username/Documents/Rconsole) 
 takes precedence.
 
 Uwe Ligges
 
 
 
  Any advice would be most gratefully received
 
 ===
  sessionInfo()
  R version 2.13.0 (2011-04-13)
  Platform: i386-pc-mingw32/i386 (32-bit)
 
  locale:
  [1] LC_COLLATE=English_Canada.1252 
 LC_CTYPE=English_Canada.1252   
 LC_MONETARY=English_Canada.1252
  [4] LC_NUMERIC=C         
          
 LC_TIME=English_Canada.1252
 
  attached base packages:
  [1] grDevices datasets 
 splines   graphics  stats 
    tcltk     utils 
    methods   base
 
  other attached packages:
  [1] svSocket_0.9-51 TinnR_1.0.3 
    R2HTML_2.2     
 Hmisc_3.8-3     survival_2.36-9
 
  loaded via a namespace (and not attached):
  [1] cluster_1.13.3  grid_2.13.0 
    lattice_0.19-26
 svMisc_0.9-61   tools_2.13.0
 



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Simple 95% confidence interval for a median

2011-05-12 Thread peter dalgaard

On May 12, 2011, at 18:33 , Greg Snow wrote:

 Contrary to the commonly held assumption, the Wilcoxin test does not deal 
 with medians in general.
 
 There are some specific cases/assumptions where the test/interval would apply 
 to the median, if I remember correctly the assumptions include that the 
 population distribution is symmetric and the only alternatives considered are 
 shifts of the distribution (both assumptions that go contrary to what I would 
 believe in most situations where I would want to use the Wilcoxin test).

Yes. Notice that the signed-rank Wilcoxon test does in fact assume symmetry 
under the null hypothesis, which does make sense when looking at differences, 
but less so away from the null. 

As far as I remember, the pseudo median minimizes the absolute value of the 
signed-rank test statistic, but to be sure, read the reference on the help page.

 
 If you want an actual confidence interval on the true meadian, then you 
 either need to make some assumptions about the distribution that the data 
 comes from, or use a tool like the bootstrap.

You can invert the binomial. Since 95 percent of the binomial distribution with 
p=.5, n=86 is between 35 and 52 you can generate a 95% CI for the median as 
sort(x)[c(34,53)].

There are a few demons lurking in the details, and it is easy be off-by-one, 
but you get the picture.

Try this

ci <- replicate(5000, {x <- rexp(86); sort(x)[c(34,53)] })
m <- qexp(.5)
ci <- ci[, order(apply(ci, 2, sum))]
matplot(t(ci), pch=".")
abline(h=m)
sum(ci[1,] > m)
sum(ci[2,] < m)

(I get about 2% error in either direction, so slightly conservative. Taking 
c(35,52), I get 3% both ways, so I suppose I got the cutoff right. A bit 
earlier in the day and I might even be able to prove it...) 

BTW, I'm sure someone has improved on this with some sort of interpolation. 
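
For reference, the binomial cutoffs used above can be obtained with qbinom (a 
sketch; the same off-by-one caveats apply):

n <- 86
qbinom(0.025, n, 0.5)      # about 34, the lower order statistic
qbinom(0.975, n, 0.5) + 1  # about 53, the upper order statistic
## so sort(x)[c(34, 53)] gives an approximate 95% CI for the median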



 
 -- 
 Gregory (Greg) L. Snow Ph.D.
 Statistical Data Center
 Intermountain Healthcare
 greg.s...@imail.org
 801.408.8111
 
 
 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of Georgina Imberger
 Sent: Thursday, May 12, 2011 7:36 AM
 To: r-help@r-project.org
 Subject: [R] Simple 95% confidence interval for a median
 
 Hi!
 
 I have a data set of 86 values that are non-normally distributed
 (counts).
 
 The median value is 10. I want to get an estimate of the  95%
 confidence
 interval for this median value.
 
 I tried to use a one-sample Wiolcoxin test:
 
 wilcox.test(Comps,mu=10,conf.int=TRUE)
 
 and got the following output:
 
 Wilcoxon signed rank test with continuity correction
 
 data:  Comps
 V = 2111, p-value = 0.05846
 alternative hypothesis: true location is not equal to 10
 95 percent confidence interval:
 10.0 17.49993
 sample estimates:
 (pseudo)median
  12.50006
 
 I wonder if someone would mind helping me out?
 
 What am I doing wrong?
 What is the '(psuedo)median'?
 Can I get R to estimate the confidence around the actual median of 10?
 
 With thanks,
 Georgie
 
  [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Saving misclassified records into dataframe within a loop

2011-05-12 Thread Phil Spector

John -
   In your example, the misclassified observations (as defined by
your predict.function) will be

  kyphosis[kyphosis$Kyphosis == 'absent' & prediction[,1] != 1,]

so you could start from there.
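
Building on that, a minimal sketch of the save-to-database step that was asked 
about (the connection object con and the table name are assumptions, not part 
of the original code):

misclassified <- kyphosis[kyphosis$Kyphosis == "absent" & prediction[, 1] != 1, ]
ids <- data.frame(id = row.names(misclassified))
## dbWriteTable(con, "misclassified_ids", ids)  # assumes an open RPostgreSQL connection 'con'
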
- Phil Spector
 Statistical Computing Facility
 Department of Statistics
 UC Berkeley
 spec...@stat.berkeley.edu


On Thu, 12 May 2011, John Dennison wrote:


Greetings R world,

I know some version of the this question has been asked before, but i need
to save the output of a loop into a data frame to eventually be written to a
postgres data base with dbWriteTable. Some background. I have developed
classifications models to help identify problem accounts. The logic is this,
if the model classifies the record as including variable X and it turns out
that record does not have X then it should be reviewed(ie i need the row
number/ID saved to a database). Generally i want to look at the
misclassified records. This is a little hack i know, anyone got a better
idea please let me know. Here is an example

library(rpart)

# grow tree
fit <- rpart(Kyphosis ~ Age + Number + Start,
   method = "class", data = kyphosis)
#predict
prediction <- predict(fit, kyphosis)

#misclassification index function

predict.function <- function(x){
for (i in 1:length(kyphosis$Kyphosis)) {
#the idea is that if the record is absent but the prediction is otherwise
#then show me that record
if (((kyphosis$Kyphosis[i]=="absent")==(prediction[i,1]==1)) == 0 ){
   #THIS WORKS
print( row.names(kyphosis[c(i),]))
}
} }

predict.function(x)

Now my issue is that i want to save these id to a data.frame so i can later
save them to a database. This this an incorrect approach. Can I save each id
to the postgres instance as it is found. i have a ignorant fear of lapply,
but it seems it may hold the key.


Ive tried

predict.function <- function(x){
results <- as.data.frame(1)
for (i in 1:length(kyphosis$Kyphosis)) {
#the idea is that if the record is absent but the prediction is otherwise
#then show me that record
if (((kyphosis$Kyphosis[i]=="absent")==(prediction[i,1]==1)) == 0 ){
   #THIS WORKS
results[i,] <- as.data.frame(row.names(kyphosis[c(i),]))
}
} }

this does not work. results object does not get saved. Any Help would be
greatly appreciated.


Thanks

John Dennison

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Change font size in Windows

2011-05-12 Thread John Kane
I looked at that yesterday and totally missed the font settings! I'm blaming 
the new glasses.  

Thank you. I'll probably do the Rconsole change but it's nice to know about this 
one if I'm using R on another machine.  That way I don't mess up someone 
else's setup.

--- On Thu, 5/12/11, Greg Snow greg.s...@imail.org wrote:

 From: Greg Snow greg.s...@imail.org
 Subject: RE: [R] Change font size in Windows
 To: John Kane jrkrid...@yahoo.ca, R R-help r-h...@stat.math.ethz.ch
 Received: Thursday, May 12, 2011, 1:07 PM
 One simple way:
 
 Run R (the gui version)
 Click on the Edit menu
 Click on the GUI Preferences item.
 
 Select the font, size, style, colors, etc. that you want.
 If you click on Save then these become the new
 default.  If you click on Apply, but don't save then
 they will last that session but not be the new defaults.
 
 -- 
 Gregory (Greg) L. Snow Ph.D.
 Statistical Data Center
 Intermountain Healthcare
 greg.s...@imail.org
 801.408.8111
 
 
  -Original Message-
  From: r-help-boun...@r-project.org
 [mailto:r-help-bounces@r-
  project.org] On Behalf Of John Kane
  Sent: Thursday, May 12, 2011 10:25 AM
  To: R R-help
  Subject: [R] Change font size in Windows
  
  My day for dumb questions. How do I increase the type
 size in the Rgui
  console in Windows? (R-2.13.0, Windows 7)
  
  It looked to me that I just needed to change the font
 spec in Rconsole
  but that does not seem to be working.
  
  The R FAQ for Windows has a reference in Q3.4 to
 changing fonts,
  (Q5.2), but I don't see anything relevant there.
  
  Rconsole originally was:
  font = TT Courier New
  points = 10
  style = normal # Style can be normal, bold, italic
  
  I changed this to points=14 and then in desperation to
 points=18 with
  no effect.
  
  I have both restarted R and done a complete reboot to
 no avail.  What
  am I missing?
  
  Any advice would be most gratefully received
 
 ===
   sessionInfo()
  R version 2.13.0 (2011-04-13)
  Platform: i386-pc-mingw32/i386 (32-bit)
  
  locale:
  [1] LC_COLLATE=English_Canada.1252 
 LC_CTYPE=English_Canada.1252
  LC_MONETARY=English_Canada.1252
  [4] LC_NUMERIC=C         
          
 LC_TIME=English_Canada.1252
  
  attached base packages:
  [1] grDevices datasets 
 splines   graphics  stats 
    tcltk     utils
  methods   base
  
  other attached packages:
  [1] svSocket_0.9-51 TinnR_1.0.3 
    R2HTML_2.2     
 Hmisc_3.8-3
  survival_2.36-9
  
  loaded via a namespace (and not attached):
  [1] cluster_1.13.3  grid_2.13.0 
    lattice_0.19-26 svMisc_0.9-61
  tools_2.13.0
  
  __
  R-help@r-project.org
 mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide http://www.R-project.org/posting-
  guide.html
  and provide commented, minimal, self-contained,
 reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] do.call and applying na.rm=TRUE

2011-05-12 Thread John Kerpel
Hi all!  I need to do something really simple using do.call.

If I want to call the mean function inside do.call, how do I apply the
condition na.rm=TRUE?

So, I use do.call(mean, list(x)) where x is my data.  This works fine if
there are no NAs.

Thanks,

John

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] do.call and applying na.rm=TRUE

2011-05-12 Thread Jonathan Daily
?do.call

Second argument is a list of arguments to pass. Try do.call(mean,
list(x, na.rm = T))
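
For example:

x <- c(1, 2, NA, 4)
do.call(mean, list(x))                # NA
do.call(mean, list(x, na.rm = TRUE))  # 2.333333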

On Thu, May 12, 2011 at 1:57 PM, John Kerpel john.ker...@gmail.com wrote:
 Hi all!  I need to do something really simple using do.call.

 If I want to call the mean function inside do.call, how do I apply the
 condition na.rm=TRUE?

 So, I use do.call(mean, list(x)) where x is my data.  This works fine if
 there are no NAs.

 Thanks,

 John

        [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
===
Jon Daily
Technician
===
#!/usr/bin/env outside
# It's great, trust me.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Survival Rate Estimates

2011-05-12 Thread David Winsemius


On May 12, 2011, at 12:40 PM, Brian McLoone wrote:


Dear List,

Is there an automated way to use the survival package to generate  
survival

rate estimates and their standard errors?  To be clear, *not *the
survivorship estimates (which are cumulative), but the survival  
*rate *

estimates...


Not entirely clear, but from context I suspect you mean instantaneous  
hazard?


(Survival is not a rate but rather a proportion. Mortality can be a  
rate. The instantaneous hazard is the decrement in survival per unit  
time divided by the survival to that time.)


So at each death the non-parametric estimate would divide current  
deaths (often 1 but ties are possible) by time since last death and  
then divide by proportion surviving.


Or if you have a semi-parametric estimated function for survival (such  
as might be output from `basehaz` which calls `survfit`) take:


 -delta_survival/delta_time/survival

tdata <- data.frame(time = c(1,1,1,2,2,2,3,3,3,4,4,4),
status = rep(c(1,0,2), 4), n = c(12,3,2,6,2,4,2,0,2,3,3,5))
fit <- survfit(Surv(time, time, status, type='interval') ~ 1,
data = tdata, weight = n)

 T <- c(0, fit$time)
 S <- c(1, fit$surv)
 (-diff(S)/diff(T) )/fit$surv
[1] 0.8602308 0.8247746 0.4044324 1.2115931

I don't know if Therneau's opinion about estimating smoothed hazards  
has changed:

http://finzi.psych.upenn.edu/Rhelp10/2009-March/193104.html
There is also a muhaz package which may generate standard errors for  
its estimates but I have read elsewhere that it does not do Cox models.

http://finzi.psych.upenn.edu/R/library/muhaz/html/00Index.html
--

David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] [R-pkgs] new package 'mvmeta' to perform multivariate meta-analysis

2011-05-12 Thread Antonio.Gasparrini
Dear R Community,

I am pleased to announce the release of a new package called  'mvmeta', now 
available on CRAN (version 0.2.0).
 
The package mvmeta provides some functions to perform fixed and random-effects 
multivariate meta-analysis and meta-regression. This modelling framework is 
exploited to pool multiple correlated outcomes across studies, and is already 
applied in different fields: meta-analysis of randomized controlled trials 
reporting more than 1 outcome, multi-site observational studies estimating 
multi-parameterized associations, among others.
 
The package is fully documented through help pages. A package vignette will be 
hopefully added soon.
 
I hope that this package will be useful to your work. 
Any kind of feedback (questions, suggestions, bug-reports, etc.) is appreciated.

Sincerely,

Antonio Gasparrini
London School of Hygiene & Tropical Medicine
Department of Social and Environmental Health Research
15-17 Tavistock Place, London WC1H 9SH, UK
Office: 0044 (0)20 79272406
Mobile: 0044 (0)79 64925523
Skype contact: a.gasparrini
http://www.lshtm.ac.uk/people/gasparrini.antonio 

___
R-packages mailing list
r-packa...@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] problem with mediation

2011-05-12 Thread Dennis Murphy
Hi:

Try this:

library(sos) # install first if you don't have it already
findFn('mediation')

You should find at least a half dozen packages from which to choose,
at least three of which appear to be devoted to mediation analysis.

HTH,
Dennis

On Thu, May 12, 2011 at 8:56 AM, Mervi Virtanen mervi.virta...@uta.fi wrote:



 Hello!

 I have problem with mediation analysis. I can do it with function
 mediate, when I have one mediator. But how I can do it if I have one
 independent variable and one dependent variable but 4 mediators? I
 have try function mediations, but it dosen't work. If I use mediate 4 times,
 each for every mediator, is it same? I want to know what is total mediate
 effect for 4 mediators.

 t.Mete

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Fw: Help with PLSR

2011-05-12 Thread Amit Patel


Hi 
I am attempting to use plsr, which is part of the pls package in R. I 
am conducting analysis on datasets to identify which proteins/peptides are 
responsible for the variance between sample groups (biomarker spotting) in a 
multivariate fashion. 


I have a dataset in R called FullDataListTrans. As you can see below, the 
structure of the data is 40 rows, each representing a sample, and 94,272 
columns, each representing a peptide.

str(FullDataListTrans)
 num [1:40, 1:94727] 42 40.9 65 56 61.7 ...
 - attr(*, dimnames)=List of 2
  ..$ : chr [1:40] X X.1 X.12 X.13 ...
  ..$ : NULL

I have also created a vector GroupingList which gives the group names for each 
respective sample (row).

 GroupingList
 [1] 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4
[39] 4 4
 str(GroupingList)
 int [1:40] 1 1 1 1 1 1 1 1 1 1 ...

I am now stuck while conducting the plsr. I have tried various methods of 
creating structured lists etc. and have got nowhere. I have also tried many 
incarnations of 


BHPLS1 <- plsr(GroupingList ~ PCIList, ncomp = FeaturePresenceExpected[1], data 
= FullDataListTrans, validation = "LOO")

Where am I going wrong? Also, what is the easiest method to identify which of 
the 
94,000 peptides are most important to the variance between groups?

Thanks in advance for any help

Amit Patel
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Read.xls in gdata

2011-05-12 Thread Bos, Roger
All,  

When I use gdata::read.xls to read in an excel file it seems to round
the data to three decimal places and also converts the dates to factors.
Does anyone know how to 1) get more precision in the numeric data and 2)
how to prevent the dates from being converted to levels or factors?  I
tried setting as.is=TRUE, but that didn't help.

Thanks, Roger
***

This message is for the named person's use only. It may\...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Assistance R

2011-05-12 Thread Carlosmagno
Assistance R, 

When trying to read data in txt format into R, I get the
following error: 


Erro em scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, 
: 
  linha 1 não tinha 10 elementos 
(i.e. "line 1 did not have 10 elements")


I would like to know how to fix this error so that I can proceed with my
analysis, because I need it urgently. 

Att 

Carlos Magno 

--
View this message in context: 
http://r.789695.n4.nabble.com/Assistance-R-tp3518289p3518289.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] extract integers from string

2011-05-12 Thread Alon Honig
I have a vector with a long list of sentences that contain integers. I
would like to extract the integers in a manner such that they are
separate and manipulatable. for example:
x[i] <- "sally has 20 dollars in her pocket and 3 marbles"
x[i+1] <- "30 days ago john had a 400k house"

all sentences are different and contain a mixture of both integers and
characters.

 i would like to get a conditional matrix such that:

y[i,j] <- 20      y[i,j+1] <- 3
y[i+1,j] <- 30    y[i+1,j+1] <- 400

based on some criteria (i.e. order, string length, keyword, etc...)
the integers are sorted.


Most of my trouble is with finding the correct way to use gsub() or
strsplit() such that the strings become integers that can be put into
a matrix.

thanks.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Sensitivity Analysis: Morris method - argument scale

2011-05-12 Thread Christoph Warkotsch
Dear R-users,
 
I have a question on the logical argument scale in the morris-function
from the sensitivity package. Should it be set to TRUE or FALSE?
 
Thanks,
 
Chris

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] extract integers from string

2011-05-12 Thread Henrique Dallazuanna
Try this:

library(gsubfn)
strapply(x, "\\d+", as.numeric, simplify = rbind)
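
With the two example sentences from the question this gives (assuming the 
gsubfn package is installed):

x <- c("sally has 20 dollars in her pocket and 3 marbles",
       "30 days ago john had a 400k house")
strapply(x, "\\d+", as.numeric, simplify = rbind)
##      [,1] [,2]
## [1,]   20    3
## [2,]   30  400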

On Thu, May 12, 2011 at 3:06 PM, Alon Honig honey...@gmail.com wrote:
 I have a vector with a long list of sentences that contain integers. I
 would like to extract the integers in a manner such that they are
 separate and manipulatable. for example:
 x[i] - sally has 20 dollars in her pocket and 3 marbles
 x[i+1] -  30 days ago john had a 400k house

 all sentences are different and contain a mixture of both integers and
 characters.

  i would like to get a conditional matrix such that:

 y[i,j] - 20    y[i,j+1] - 3
 y[i+1,j] - 30 y[i+1,j+1] - 400

 based on some criteria (i.e. order, string length, keyword, etc...)
 the integers are sorted.


 most of my trouble is with finding the correct way to use gsub() or
 strsplit() such that the strings are integers that can be inputed into
 a matrix.

 thanks.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Assistance R

2011-05-12 Thread Alexander Engelhardt

Am 12.05.2011 20:14, schrieb Carlosmagno:

Assistance R,

When trying to insert data in txt format already set up R pr is the
following error:


Erro em scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,
:
   linha 1 não tinha 10 elementos


I would like to know how to remedy this error pr I proceeded with my
analysis, because I need to urgently


We need more information. What command do you use to read the file? What 
do the first 5 lines of your data.txt look like?


 -- Alex

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Assistance R

2011-05-12 Thread John Kane


As Alexander Engelhardt says, we need more information.

Please give us the code you are using and a sample of the data.

However, one thing you might want to do is check what the separator is for the 
data. You may be reading something like tab-delimited data when you think it is 
comma-delimited. Or something similar.
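
For example, something along these lines (a sketch; the file name and the 
separator are assumptions):

## if the file is really tab-delimited rather than comma- or space-delimited:
dat <- read.table("data.txt", header = TRUE, sep = "\t")
## count the fields per line to spot the offending row:
count.fields("data.txt", sep = "\t")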

--- On Thu, 5/12/11, Carlosmagno carlosmagno...@ig.com.br wrote:

 From: Carlosmagno carlosmagno...@ig.com.br
 Subject: [R] Assistance R
 To: r-help@r-project.org
 Received: Thursday, May 12, 2011, 2:14 PM
 Assistance R, 
 
 When trying to insert data in txt format already set up R
 pr is the
 following error: 
 
 
 Erro em scan(file, what, nmax, sep, dec, quote, skip,
 nlines, na.strings, 
 : 
   linha 1 não tinha 10 elementos 
 
 
 I would like to know how to remedy this error pr I
 proceeded with my
 analysis, because I need to urgently 
 
 Att 
 
 Carlos Magno 
 
 --
 View this message in context: 
 http://r.789695.n4.nabble.com/Assistance-R-tp3518289p3518289.html
 Sent from the R help mailing list archive at Nabble.com.
 
 __
 R-help@r-project.org
 mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained,
 reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] DCC-GARCH model and AR(1)-GARCH(1,1) regression model

2011-05-12 Thread Marcin P?�ciennik
Hello,
I have a rather complex problem... I will have to explain everything in
detail because I cannot solve it by myself... I just ran out of ideas. So
here is what I want to do:
I take quotes of two indices - SP500 and DJ. And my first aim is to
estimate coefficients of the DCC-GARCH model for them. This is how I do it:


library(tseries)
p1 = get.hist.quote(instrument = "^gspc", start = "2005-01-07", end =
"2009-09-04", compression = "w", quote = "AdjClose")
p2 = get.hist.quote(instrument = "^dji", start = "2005-01-07", end =
"2009-09-04", compression = "w", quote = "AdjClose")
p = cbind(p1,p2)
y = diff(log(p))*100
y[,1] = y[,1]-mean(y[,1])
y[,2] = y[,2]-mean(y[,2])
T = length(y[,1])

library(ccgarch)
library(fGarch)

f1 = garchFit(~ garch(1,1), data=y[,1],include.mean=FALSE)
f1 = f1@fit$coef
f2 = garchFit(~ garch(1,1), data=y[,2],include.mean=FALSE)
f2 = f2@fit$coef

a = c(f1[1], f2[1])
A = diag(c(f1[2],f2[2]))
B = diag(c(f1[3], f2[3]))
dccpara = c(0.2,0.6)
dccresults = dcc.estimation(inia=a, iniA=A, iniB=B, ini.dcc=dccpara, dvar=y,
model="diagonal")

dccresults$out
DCCrho = dccresults$DCC[,2]
matplot(DCCrho, type='l')

dccresults$out deliver me the estimated coefficients of the DCC-GARCH model.
And here is my first question:
How can I check if these coefficients are significant or not? How can I test
them for significance?

My second question would be:
Is it true that matplot(DCCrho, type='l') shows the conditional correlation
between the two indices in question?


Ok. This would be it when it comes to DCC-GARCH.

Now, using conditional correlation obtained from the DCC-GARCH model, I want
to test for structural shifts in conditional correlations. To be precise, I
want to test whether the conditional correlations significantly increase in
the turmoil period / during the Subprime crisis.
The regression model is AR(1)-GARCH(1,1), using a dummy variable specified
as:



*** the equations, you can find in the attachment ***



where the first equation is the conditional correlation among the two
indices during the Subprime crisis, Dt is a dummy variable for the turmoil
period, and the second equation (hij,t) is the conditional variance of eij,t

The aim is, of course, to find the estimates of the regression model on
structural shifts in the conditional correlations obtained in the DCC-GARCH
model.

I found information that there is no function for an AR(1)-GARCH(1,1)
regression model. That's why it has to be done in two steps:
1) estimate the AR parameters
2) estimate the GARCH part of the model on the residuals from the AR model

And this would be my rather poor idea of how to do it...


library(timeSeries)
library(fSeries)
step1 = arma(DCCrho, order = c(1,0), include.intercept = TRUE)
step1$res
step11 = na.remove(step1$res)
step2 = garch (step11, order = c(1,1), include.intercept = TRUE)


To be honest I have no clue how to do it. I don't even know why I get a
missing value as a result of step1 (step1$res[1]) and how to account for it.
Above, I just removed it but then I have a smaller number of
observations...and this is probably wrong.
And then these GARCH estimates on the residuals...does that make sense at
all?
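
As a hedged aside (a sketch, not a fix of the two-step code above): the leading NA
from arma() is expected, because an AR(1) fit has no residual for the first
observation when there is no lagged value for it. Also, fGarch::garchFit accepts a
combined mean/variance formula, so an AR(1) mean with a GARCH(1,1) variance can be
estimated jointly; the dummy-variable term of the regression model would still have
to be handled separately.

library(fGarch)
# joint AR(1)-GARCH(1,1) fit on the conditional-correlation series from above
fit.ar.garch <- garchFit(~ arma(1, 0) + garch(1, 1),
                         data = as.numeric(DCCrho), trace = FALSE)
summary(fit.ar.garch)   # coefficient table with t statistics and p-values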


I know the mail is quite looong, but hopefully, someone will find time to
give me a hand because I have to solve the problem and I reached the point
where I cannot move forward without someone's help. There is not much
information on how to apply DCC-GARCH model and AR(1)-GARCH(1,1) regression
model on the Internet. Hopefully, some of you are familiar with it.

Thank you very much in advance, people of good will, for looking at what I
wrote and helping me.

Best regards
Marcin
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] foreach(): how to do calculations between two foreach loops?

2011-05-12 Thread Steve Lianoglou
Hi,

On Wed, May 11, 2011 at 5:44 PM, Marius Hofert m_hof...@web.de wrote:
 Dear expeRts,

 is it possible to carry out calculations between different foreach() calls?
 As for nested loops, you want to carry out calculations that do not depend on the inner
 loop only once, not for each iteration of the innermost loop.

 Cheers,

 Marius


 library(foreach)

 foreach(i=1:3) %:%
    foreach(j=1:2) %do% {
        i <- i+1
        print(paste(i,j))
    }

 foreach(i=1:3) %:%
    i <- i+1 # lengthy calculation which has to be done only once, not for each j
    foreach(j=1:2) %do% {
        print(paste(i,j))
    }

If I understand your question well, one solution might be to break it
into two parallized parts, eg:

R> lengthy.calculation.i <- foreach(i=1:3) %dopar% someIntenseFunction(i)
R> foreach(i=1:3) %:% foreach(j=1:3) %dopar% {
  anotherIntenseFunction(lengthy.calculation.i[[i]], j)
}

I believe the left and right hand side of the %:% operator need a
foreach object, so your original incantation wouldn't work.
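
Here is a small runnable sketch of that two-stage idea with a sequential %do%
backend (the computations are just placeholders):

library(foreach)

# stage 1: the expensive per-i work, done once per i
lengthy.calculation.i <- foreach(i = 1:3) %do% {
  i^2   # stand-in for a lengthy calculation
}

# stage 2: the nested loop only reuses the precomputed values
res <- foreach(i = 1:3, .combine = rbind) %:%
  foreach(j = 1:2, .combine = c) %do% {
    lengthy.calculation.i[[i]] + j
  }
res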

-steve
-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Mixed Ordinal logistic regression: marginal probabilities and standard errors for the marginal probabilities

2011-05-12 Thread Ma Ya
Dear R-list helpers:

I am trying to run an ordinal logistic regression using the lmer function (I
think that is the correct function, although I have not tested it out yet); it is
going to be a mixed model with the second level as random. Instead of the regular
estimate results, I need to get the marginal probabilities and standard
errors for the marginal probabilities. Which option in lmer would give me
those two kinds of values?

Thank you.

Ya Ma

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] assigning creating missing rows and values

2011-05-12 Thread Schatzi
I have a dataset with missing times (11:00 and 16:00). I would like
the output to include the missing times so that the final time vector looks
like realt, with each inserted time taking the previous time's value. E.g., if meas at time 15:30 is
0.45, then the meas for time 16:00 will also be 0.45.
meas are the measurements and times are the times at which they were taken.

meas <- runif(18)
times <- c("08:30","09:00","09:30","10:00","10:30","11:30","12:00","12:30","13:00","13:30","14:00","14:30","15:00",
"15:30","16:30","17:00","17:30","18:00")
output <- data.frame(meas, times)
realt <- c("08:30","09:00","09:30","10:00","10:30","11:00","11:30","12:00","12:30","13:00","13:30","14:00","14:30","15:00","15:30","16:00","16:30","17:00","17:30","18:00")


-
In theory, practice and theory are the same. In practice, they are not - Albert 
Einstein
--
View this message in context: 
http://r.789695.n4.nabble.com/assigning-creating-missing-rows-and-values-tp3518633p3518633.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Survival Rate Estimates

2011-05-12 Thread David Winsemius


On May 12, 2011, at 2:19 PM, David Winsemius wrote:



On May 12, 2011, at 12:40 PM, Brian McLoone wrote:


Dear List,

Is there an automated way to use the survival package to generate survival
rate estimates and their standard errors? To be clear, *not* the
survivorship estimates (which are cumulative), but the survival *rate*
estimates...


Not entirely clear, but from context I suspect you mean instantaneous hazard?


(Survival is not a rate but rather a proportion. Mortality can be a  
rate. The instantaneous hazard is the decrement in survival per unit  
time divided by the survival to that time.)


So at each death the non-parametric estimate would divide current  
deaths (often 1 but ties are possible) by time since last death and  
then divide by proportion surviving.


Or if you have a semi-parametric estimated function for survival  
(such as might be output from `basehaz` which calls `survfit`) take:


-delta_survival/delta_time/survival

library(survival)   # for Surv() and survfit()
tdata <- data.frame(time = c(1,1,1,2,2,2,3,3,3,4,4,4),
status = rep(c(1,0,2),4), n = c(12,3,2,6,2,4,2,0,2,3,3,5))
fit <- survfit(Surv(time, time, status, type='interval') ~ 1,
data = tdata, weight = n)

 T <- c(0, fit$time)


I was doing something else in this session and realized that using 'T'  
was _not_ a good choice here.


 T == TRUE
[1] FALSE  TRUE FALSE FALSE FALSE

I (almost) always spell out TRUE but not everyone does. Better to use  
'sT' or almost anything else.
(But don't use: c, df, C, F, pi, rm, t, qt, pt, rt, dt, rf, qf, ...)


 rm(T)
 T == TRUE
[1] TRUE

--
David.



 S <- c(1, fit$surv)
 (-diff(S)/diff(T))/fit$surv
[1] 0.8602308 0.8247746 0.4044324 1.2115931

I don't know if Therneau's opinion about estimating smoothed hazards  
has changed:

http://finzi.psych.upenn.edu/Rhelp10/2009-March/193104.html
There is also a muhaz package which may generate standard errors for  
its estimates, but I have read elsewhere that it does not do Cox
models.

http://finzi.psych.upenn.edu/R/library/muhaz/html/00Index.html
--

David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] assigning creating missing rows and values

2011-05-12 Thread David Winsemius


On May 12, 2011, at 4:33 PM, Schatzi wrote:

I have a dataset where I have missing times (11:00 and 16:00). I would like
the outputs to include the missing time so that the final time vector looks
like realt and has the previous time's value. Ex. If meas at time 15:30 is
0.45, then the meas for time 16:00 will also be 0.45.
meas are the measurements and times are the times at which they were taken.


meas <- runif(18)
times <- c("08:30","09:00","09:30","10:00","10:30","11:30","12:00","12:30","13:00",
           "13:30","14:00","14:30","15:00","15:30","16:30","17:00","17:30","18:00")
output <- data.frame(meas, times)
realt <- c("08:30","09:00","09:30","10:00","10:30","11:00","11:30","12:00","12:30",
           "13:00","13:30","14:00","14:30","15:00","15:30","16:00","16:30","17:00",
           "17:30","18:00")


Package 'zoo' has an 'na.locf' function, which I believe stands for NA's
last observation carried forward. So make a regular set of times, merge,
and carry forward. I'm pretty sure you can find many examples in the Archive.
Gabor is very good about spotting places where his many contributions can be
successfully deployed.
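
A minimal sketch of that merge-then-carry-forward recipe, reusing the vectors from
the original post (the character times happen to sort chronologically here only
because they all use two-digit hours):

library(zoo)
meas  <- runif(18)
times <- c("08:30","09:00","09:30","10:00","10:30","11:30","12:00","12:30",
           "13:00","13:30","14:00","14:30","15:00","15:30","16:30","17:00",
           "17:30","18:00")
realt <- c("08:30","09:00","09:30","10:00","10:30","11:00","11:30","12:00",
           "12:30","13:00","13:30","14:00","14:30","15:00","15:30","16:00",
           "16:30","17:00","17:30","18:00")
output <- data.frame(times, meas)

# merge onto the full grid of times, then carry the last observation forward
full <- merge(data.frame(times = realt), output, by = "times", all.x = TRUE)
full$meas <- na.locf(full$meas)
full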


--

David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] assigning creating missing rows and values

2011-05-12 Thread Bert Gunter
...  But beware: Last observation carried forward is a widely used but
notoriously bad (biased) way to impute missing values; and, of course,
inference based on such single imputation is bogus (how bogus depends
on how much imputation, among other things, of course).
Unfortunately, dealing with such data well requires considerable
statistical sophistication, which is why statisticians are widely
employed in the clinical trial business, where missing data in
longitudinal series are relatively common. You may therefore find it
useful to consult a local statistician if one is available.

As an extreme -- and unrealistic -- example of the problem,  suppose
your series consisted of 12 hours of data measured every half hour and
that one series had only two measurements, the first and the last. The
first value is 10 and the last is 1. LOCF would fill in the missings
as all 10's. Obviously, a dumb thing to do. For real data, the problem
would  not be so egregious, but the fundamental difficulty is the
same.
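
To make the arithmetic of that extreme case concrete, a quick sketch (25 half-hourly
points assumed over the 12 hours):

library(zoo)
x <- c(10, rep(NA, 23), 1)   # only the first and last points observed
filled <- na.locf(x)
table(filled)                # 24 values at 10, a single 1 at the very end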

(Apologies to those for whom my post is a familiar, boring refrain.
Unfortunately, I do not have the imagination to offer better).

Cheers,
Bert

On Thu, May 12, 2011 at 1:43 PM, David Winsemius dwinsem...@comcast.net wrote:

 On May 12, 2011, at 4:33 PM, Schatzi wrote:

 I have a dataset where I have missing times (11:00 and 16:00). I would
 like
 the outputs to include the missing time so that the final time vector
 looks
 like realt and has the previous time's value. Ex. If meas at time 15:30
 is
 0.45, then the meas for time 16:00 will also be 0.45.
 meas are the measurements and times are the times at which they were
 taken.

 meas <- runif(18)

 times <- c("08:30","09:00","09:30","10:00","10:30","11:30","12:00","12:30","13:00","13:30","14:00","14:30","15:00",
 "15:30","16:30","17:00","17:30","18:00")
 output <- data.frame(meas, times)

 realt <- c("08:30","09:00","09:30","10:00","10:30","11:00","11:30","12:00","12:30","13:00","13:30","14:00","14:30","15:00","15:30","16:00","16:30","17:00","17:30","18:00")

 Package 'zoo' has an 'na.locf' function which I believe stands for NA's
 last observation carried forward. So make a regular set of times, merge and
 carry forward. I'm pretty sure you can find many examples in the Archive.
 Gabor is very good about spotting places where his many contributions can be
 successfully deployed.

 --

 David Winsemius, MD
 West Hartford, CT

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Men by nature long to get on to the ultimate truths, and will often
be impatient with elementary studies or fight shy of them. If it were
possible to reach the ultimate truths without the elementary studies
usually prefixed to them, these would not be preparatory studies but
superfluous diversions.

-- Maimonides (1135-1204)

Bert Gunter
Genentech Nonclinical Biostatistics

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] assigning creating missing rows and values

2011-05-12 Thread Adele_Thompson
I am still working on the weights problem. If the animals do not eat (like 
after sunset), then no new feed weight will be calculated and no new row will 
be entered. Thus, if I just use the previous value, it should be correct for 
how much cumulative feed was eaten that day up to that point.
I will play around with that package and try getting it to work for me. Thank 
you.

-Original Message-
From: gunter.ber...@gene.com [mailto:gunter.ber...@gene.com] 
Sent: Thursday, May 12, 2011 04:13 PM
To: dwinsem...@comcast.net
Cc: Thompson, Adele - adele_thomp...@cargill.com; r-help@r-project.org
Subject: Re: [R] assigning creating missing rows and values

...  But beware: Last observation carried forward is a widely used but
notoriously bad (biased) way to impute missing values; and, of course,
inference based on such single imputation is bogus (how bogus depends
on how much imputation, among other things, of course).
Unfortunately, dealing with such data well requires considerable
statistical sophistication, which is why statisticians are widely
employed in the clinical trial business, where missing data in
longitudinal series are relatively common. You may therefore find it
useful to consult a local statistician if one is available.

As an extreme -- and unrealistic -- example of the problem,  suppose
your series consisted of 12 hours of data measured every half hour and
that one series had only two measurements, the first and the last. The
first value is 10 and the last is 1. LOCF would fill in the missings
as all 10's. Obviously, a dumb thing to do. For real data, the problem
would  not be so egregious, but the fundamental difficulty is the
same.

(Apologies to those for whom my post is a familiar, boring refrain.
Unfortunately, I do not have the imagination to offer better).

Cheers,
Bert

On Thu, May 12, 2011 at 1:43 PM, David Winsemius dwinsem...@comcast.net wrote:

 On May 12, 2011, at 4:33 PM, Schatzi wrote:

 I have a dataset where I have missing times (11:00 and 16:00). I would
 like
 the outputs to include the missing time so that the final time vector
 looks
 like realt and has the previous time's value. Ex. If meas at time 15:30
 is
 0.45, then the meas for time 16:00 will also be 0.45.
 meas are the measurements and times are the times at which they were
 taken.

 meas <- runif(18)

 times <- c("08:30","09:00","09:30","10:00","10:30","11:30","12:00","12:30","13:00","13:30","14:00","14:30","15:00",
 "15:30","16:30","17:00","17:30","18:00")
 output <- data.frame(meas, times)

 realt <- c("08:30","09:00","09:30","10:00","10:30","11:00","11:30","12:00","12:30","13:00","13:30","14:00","14:30","15:00","15:30","16:00","16:30","17:00","17:30","18:00")

 Package 'zoo' has an 'na.locf' function which I believe stands for NA's
 last observation carried forward. So make a regular set of times, merge and
 carry forward. I'm pretty sure you can find many examples in the Archive.
 Gabor is very good about spotting places where his many contributions can be
 successfully deployed.

 --

 David Winsemius, MD
 West Hartford, CT

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Men by nature long to get on to the ultimate truths, and will often
be impatient with elementary studies or fight shy of them. If it were
possible to reach the ultimate truths without the elementary studies
usually prefixed to them, these would not be preparatory studies but
superfluous diversions.

-- Maimonides (1135-1204)

Bert Gunter
Genentech Nonclinical Biostatistics

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Saving misclassified records into dataframe within a loop

2011-05-12 Thread John Dennison
Having poked at the problem a couple more times, it appears my issue is that the
object I save within the loop is not available after the function ends. I
have no idea why it is acting in this manner.


library(rpart)

# grow tree
fit <- rpart(Kyphosis ~ Age + Number + Start,
             method = "class", data = kyphosis)
# predict
prediction <- predict(fit, kyphosis)

# misclassification index function

results <- as.data.frame(1)

predict.function <- function(x){
  j <- 0
  for (i in 1:length(kyphosis$Kyphosis)) {
    if (((kyphosis$Kyphosis[i] == "absent") == (prediction[i,1] == 1)) == 0){
      j <- j + 1
      results[j,] <- row.names(kyphosis[c(i),])
      print(row.names(kyphosis[c(i),]))
    }
  }
  print(results)
  save(results, file = "results")
}


I can load results from the file and my output is there. However, if I just
type results, I get the original 1. What in the Lord's name is occurring?

Thanks

John
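
For what it is worth, a hedged sketch of the underlying issue and one way around it:
assignments inside a function modify a local copy, so the results object built in the
loop vanishes when the function returns unless it is returned to the caller. A
vectorised alternative is sketched below; note it uses type = "class", so
misclassification here means predicted class differing from observed class, which is
not exactly the original condition.

library(rpart)
fit        <- rpart(Kyphosis ~ Age + Number + Start, method = "class", data = kyphosis)
pred.class <- predict(fit, kyphosis, type = "class")
results    <- data.frame(id = row.names(kyphosis)[pred.class != kyphosis$Kyphosis])
results    # lives in the workspace, ready for dbWriteTable()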



On Thu, May 12, 2011 at 1:50 PM, Phil Spector spec...@stat.berkeley.edu wrote:

 John -
   In your example, the misclassified observations (as defined by
 your predict.function) will be

  kyphosis[kyphosis$Kyphosis == 'absent' & prediction[,1] != 1,]

 so you could start from there.
- Phil Spector
 Statistical Computing Facility
 Department of Statistics
 UC Berkeley
 spec...@stat.berkeley.edu



 On Thu, 12 May 2011, John Dennison wrote:

  Greetings R world,

 I know some version of this question has been asked before, but I need
 to save the output of a loop into a data frame to eventually be written to a
 postgres database with dbWriteTable. Some background: I have developed
 classification models to help identify problem accounts. The logic is this:
 if the model classifies the record as including variable X and it turns out
 that the record does not have X, then it should be reviewed (i.e. I need the row
 number/ID saved to a database). Generally I want to look at the
 misclassified records. This is a little hack, I know; if anyone has a better
 idea, please let me know. Here is an example

 library(rpart)

 # grow tree
 fit <- rpart(Kyphosis ~ Age + Number + Start,
  method = "class", data = kyphosis)
 # predict
 prediction <- predict(fit, kyphosis)

 # misclassification index function

 predict.function <- function(x){
 for (i in 1:length(kyphosis$Kyphosis)) {
 # the idea is that if the record is absent but the prediction is otherwise,
 # then show me that record
 if (((kyphosis$Kyphosis[i] == "absent") == (prediction[i,1] == 1)) == 0 ){
  # THIS WORKS
 print( row.names(kyphosis[c(i),]))
 }
 } }

 predict.function(x)

 Now my issue is that I want to save these ids to a data.frame so I can later
 save them to a database. Is this an incorrect approach? Can I save each id
 to the postgres instance as it is found? I have an ignorant fear of lapply,
 but it seems it may hold the key.


 Ive tried

 predict.function <- function(x){
 results <- as.data.frame(1)
 for (i in 1:length(kyphosis$Kyphosis)) {
 # the idea is that if the record is absent but the prediction is otherwise,
 # then show me that record
 if (((kyphosis$Kyphosis[i] == "absent") == (prediction[i,1] == 1)) == 0 ){
  # THIS WORKS
 results[i,] <- as.data.frame(row.names(kyphosis[c(i),]))
 }
 } }

 This does not work. The results object does not get saved. Any help would be
 greatly appreciated.


 Thanks

 John Dennison

[[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] R development master class: SF, June 8-9

2011-05-12 Thread Hadley Wickham
Hi all,

I hope you don't mind the slightly off topic email, but I'm going to
be teaching an R development master class in San Francisco on June
8-9.  The basic idea of the class is to help you write better code,
focused on the mantra of "do not repeat yourself". On day one you will
learn powerful new tools of abstraction, allowing you to solve a wider
range of problems with fewer lines of code. Day two will teach you how
to make packages, the fundamental unit of code distribution in R,
allowing others to save time by using your code.

To get the most out of this course, you should have some experience
programming in R already: you should be familiar with writing
functions, and the basic data structures of R: vectors, matrices,
arrays, lists and data frames. You will find the course particularly
useful if you're an experienced R user looking to take the next step,
or if you're moving to R from other programming languages and you want
to quickly get up to speed with R's unique features.

Both days will incorporate a mix of lectures and hands-on learning.
Expect to learn about a topic and then immediately put it into
practice with a small example. Plenty of help will be available if you
get stuck - there will be one assistant for every 10 attendees. You'll
receive a printed copy of all slides, as well as electronic access to
the slides, code and data. The material covered in the course is
currently being turned into a book. You can access the current draft
at https://github.com/hadley/devtools/wiki/.

More information, including a complete session outline for the two
days is available at:
http://www.revolutionanalytics.com/products/training/public/r-development.php

Regards,

Hadley

PS.  I'm also offering an internet version of the day one through
statistics.com -
http://www.statistics.com/courses/using-r/r-program-adv/

--
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

