[R] normalité test using over identifying moment conditions

2013-02-01 Thread Usman Gilani
Hi,
I'm trying to interpret the following results, with respect to normality test 
using the over identifying moment conditions

where returns have normal distribution
with parameter mu,sd
and i have 4 moment conditions 

E[r-mu/sd]=0

E[(r-mu)^2/sd-1]=0

E[(r-mu)^3/sd^3]=0

E[(r-mu)^4/sd^4-3]=0

output..
gel(g = g, x = returns, tet0 = c(f3$estimate[1], f3$estimate[2]))

Type of GEL:  EL 

Coefficients:
  Estimate  Std. Error t value   Pr(|t|)
mean  -0.01168   0.05614-0.20805   0.83519
sd 1.77591   0.0396544.79218   0.0

Lambdas:
Estimate   Std. Error  t valuePr(|t|) 
Lambda[1]   -0.097430.03912-2.490280.01276
Lambda[2]0.657280.0244326.905050.0
Lambda[3]0.032470.01304 2.489610.01279
Lambda[4]   -0.109540.00407   -26.904230.0

 Over-identifying restrictions tests: degrees of freedom is 2 
 statistics p-value
LR test   2.3341e+02   2.0730e-51
LM test   7.2417e+02   5.5954e-158
J test  7.2417e+025.5954e-158

Convergence code for the coefficients:  0 

Convergence code for the lambdas:  0 


does the J-test p-value rejecting the null E[g(theta,x)]=0, and which moment 
condition is true under normality

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] ks.test and wilcoxon.test results differ from other stat.packages

2013-02-01 Thread Alexander Favorov
Probably, it's an obvious info, but I have not found anything in R FAQ
concerning
this feature/bug.

The results of ks.test and wilcoxon.test (in the Mann-Whitney version,
paired = 'FALSE') don't coincide with the results from the other statistical
packages, e.g. Statistica, Medcalc, and (as for MW test) from the numerous
online MW tests.

E.g.
Statistica p-value=0.0435353
Medcalc p-value=0.0435354
R p-value=0.04635

If I want to obtain result of test once, it does not matter.

But what should I do if I want to perform Monte-Carlo
simulations and I need in 1 or even 100 000 p-values and then will build
some distribution and then use results of Statictica? Whtever, the
descrepancy bothers.

Are there some alternative packages for non-parametric statistics that
produce results comparable with the other program packages? Or, probably,
there is exists some environment variable to perform calculations in more
common way?

Thank you!

Sasha.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] difftime() out by 1 hour

2013-02-01 Thread Boots
I have a problem with results  from difftime being 1 hour different than 
expected. 2 examples are given below:

datetime - matrix(data=rbind(c(2012-03-31 21:00:00, 2012-04-01 00:00:00, 
2012-04-01 03:00:00, 2012-04-01 06:00:00),
   c(2012-10-06 21:00:00, 2012-10-07 00:00:00, 2012-10-07 
03:00:00, 2012-10-07 06:00:00)), ncol=4, nrow=2)

for (row in 1:2) {
  x - datetime[row,]
  x1=x[1:length(x)-1]
  x2=x[2:length(x)]
  xd=difftime(x2,x1)
  print (x)
  print (xd)
}

[1] 2012-03-31 21:00:00 2012-04-01 00:00:00 2012-04-01 03:00:00 
2012-04-01 06:00:00
Time differences in hours
[1] 3 4 3
attr(,tzone)
[1] 
[1] 2012-10-06 21:00:00 2012-10-07 00:00:00 2012-10-07 03:00:00 
2012-10-07 06:00:00
Time differences in hours
[1] 3 2 3
attr(,tzone)
[1] 

Accurate results would be 3 3 3 (hours) in both cases.
The problem occurs randomly, but only around midnight.
All the dates between 2012-04-01 06:00:00 and 2012-10-06 21:00:00 are fine.

I have seen postings about problems with POSIXlt and POSIXct accuracy 
because of truncation rather than rounding of floating point representations. 
The response was that it was expected behaviour.  

So my question is this expected behaviour, have I used difftime incorrectly, 
and if not, is there a set of date/time functions and classes that use
integer representation to give more accurate results? I am new to R and I may 
well be using the wrong set of functions.
Garry

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Filter according to the latest data

2013-02-01 Thread Mat
Hello together,

i have a data.frame, like this one:
 No.  Change   Date  
A  123   final2013-01-15
B  123   error   2013-01-16
C  123   bug fixed   2013-01-17
D  111   final2013-01-12

and now a want a new data.frame which includes only the newest entry for
each number.
The solution look like this:
 No.  Change   Date  
C  123   bug fixed   2013-01-17
D  111   final2013-01-12

is there any way to filter my data.frame to the latest data, perhabs max?

Thanks.

Mat



--
View this message in context: 
http://r.789695.n4.nabble.com/Filter-according-to-the-latest-data-tp4657248.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] difftime() out by 1 hour

2013-02-01 Thread Pascal Oettli

Hello,

Here is the result I get using your script:

 sessionInfo()
R version 2.15.2 (2012-10-26)
Platform: x86_64-suse-linux-gnu (64-bit)


[1] 2012-03-31 21:00:00 2012-04-01 00:00:00 2012-04-01 03:00:00
[4] 2012-04-01 06:00:00
Time differences in hours
[1] 3 3 3
attr(,tzone)
[1] 
[1] 2012-10-06 21:00:00 2012-10-07 00:00:00 2012-10-07 03:00:00
[4] 2012-10-07 06:00:00
Time differences in hours
[1] 3 3 3
attr(,tzone)
[1] 

Regards,
Pascal


Le 01/02/2013 16:27, Boots a écrit :

I have a problem with results  from difftime being 1 hour different than 
expected. 2 examples are given below:

datetime - matrix(data=rbind(c(2012-03-31 21:00:00, 2012-04-01 00:00:00, 2012-04-01 
03:00:00, 2012-04-01 06:00:00),
c(2012-10-06 21:00:00, 2012-10-07 00:00:00, 2012-10-07 
03:00:00, 2012-10-07 06:00:00)), ncol=4, nrow=2)

for (row in 1:2) {
   x - datetime[row,]
   x1=x[1:length(x)-1]
   x2=x[2:length(x)]
   xd=difftime(x2,x1)
   print (x)
   print (xd)
}

[1] 2012-03-31 21:00:00 2012-04-01 00:00:00 2012-04-01 03:00:00 2012-04-01 
06:00:00
Time differences in hours
[1] 3 4 3
attr(,tzone)
[1] 
[1] 2012-10-06 21:00:00 2012-10-07 00:00:00 2012-10-07 03:00:00 2012-10-07 
06:00:00
Time differences in hours
[1] 3 2 3
attr(,tzone)
[1] 

Accurate results would be 3 3 3 (hours) in both cases.
The problem occurs randomly, but only around midnight.
All the dates between 2012-04-01 06:00:00 and 2012-10-06 21:00:00 are fine.

I have seen postings about problems with POSIXlt and POSIXct accuracy
because of truncation rather than rounding of floating point representations.
The response was that it was expected behaviour.

So my question is this expected behaviour, have I used difftime incorrectly, 
and if not, is there a set of date/time functions and classes that use
integer representation to give more accurate results? I am new to R and I may 
well be using the wrong set of functions.
Garry

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] use name (not values!) of a dataframe inside a funktion

2013-02-01 Thread PIKAL Petr
Hi

Maybe others can help you but here is my comment. I already use R for many 
years and never used such construction. Objects in global environment shall not 
be modified by functions it is a bad practice. Imagine you have some data frame 
you prepared and controlled in many steps and use some function from a package. 

If this function scrambles object without warning and the result cannot be 
easily repaired I would be tempted to curse the function with many nasty four 
letter words.

When I want to change some data by a function I use

fff - function(x, no=2) {
x[,no]-factor(x[, no])
x
}
 
dfb.f - fff(dfb)

In this case I will end with old object dfb and new object dfb.f. Of course it 
is possible to use

dfb - fff(dfb)

and in this case dfb object is changed

Petr

 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of Winfried Moser
 Sent: Thursday, January 31, 2013 3:35 PM
 To: r-help
 Subject: [R] use name (not values!) of a dataframe inside a funktion
 
 Dear Listers,
 
 can anyone help me, please.
 
 Since several days i try to figure out, how to assign values, vectors,
 functions etc to variables with dynamically generated names inside of
 functions.
 Sometimes I succeed, but the success is rather arbitrary, it seems. up
 to now i don't fully understand, why things like get, assign, - etc
 do sometimes work, and sometimes not.
 
 here's one of my daily examples, i am stuck with: Example 1 does work,
 but example 2 doesn't?
 How kann i tell R, that i want it to expand the string dfb to
 dfb[,2]
 inside the function.
 In the end i want the function to change the second variable of the
 dataframe dfb permanently to factor (not just inside the function).
 
 Thanks in advance!
 
 Winfried
 
 
 Example 1:
 dfa - data.frame(a=c(1:4),b=c(1:4))
 dfa[,2] - factor(dfa[,2])
 is.factor(dfa[,2])
 TRUE
 
 Example 2:
 dfb - data.frame(a=c(1:4),b=c(1:4))
 f.fact - function(x) {x[,2] - factor(x[,2])}
 f.fact(dfb)
 is.factor(dfb[,2])
 FALSE
 
 
 PS: I tried a whole lot of other things like, ...
 I really don't know where to keep on searching.
 
 
 dfb - data.frame(a=c(1,2,3,4),b=c(1,2,3,4))
 f.fact - function(x) {get(x)[,2] - factor(get(x)[,2])}
 f.fact(dfb)
 is.factor(dfb[,2])
  Object 'x' nicht gefunden
 
 dfb - data.frame(a=c(1,2,3,4),b=c(1,2,3,4))
 f.fact - function(x) {get(x[,2]) - factor(x[,2])}
 f.fact(dfb)
 is.factor(dfb[,2])
  Object 'x' nicht gefunden
 
 dfb - data.frame(a=c(1,2,3,4),b=c(1,2,3,4))
 f.fact - function(x) {get(x)[,2] - factor(x[,2])}
 f.fact(dfb)
 is.factor(dfb[,2])
  Object 'x' nicht gefunden
 
 dfb - data.frame(a=c(1,2,3,4),b=c(1,2,3,4))
 f.fact - function(x) {assign(x[,2], factor(x[,2]))}
 f.fact(dfb)
 is.factor(dfb[,2])
  Ungültiges erstes Argument
 
 dfb - data.frame(a=c(1,2,3,4),b=c(1,2,3,4))
 f.fact - function(x) {quote(x)[,2], factor(x[,2])}
 f.fact(dfb)
 is.factor(dfb[,2])
  Unerwartetes ','
 
 dfb - data.frame(a=c(1,2,3,4),b=c(1,2,3,4))
 f.fact - function(x) {
 name - paste0(quote(x),[,2])
  assign(name, factor(x[,2]))}
 f.fact(dfb)
 is.factor(dfb[,2])
  FALSE
 
 dfb - data.frame(a=c(1,2,3,4),b=c(1,2,3,4))
 f.fact - function(x) {
 name - paste0(get(x),[,2])
 assign(name, factor(x[,2]))}
 f.fact(dfb)
 is.factor(dfb[,2])
  Falsche Anzahl von Dimensionen
 
 dfb - data.frame(a=c(1,2,3,4),b=c(1,2,3,4))
 f.fact - function(x) {
  name - paste0(x,[,2])
 assign(name, factor(x[,2]))}
 f.fact(dfb)
 is.factor(dfb[,2])
  Falsche Anzahl von Dimensionen
 
 ächz ...
 
   [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Transforming 4x3 data frame into 2 column df in R

2013-02-01 Thread Jorge I Velez
Dear Gundala,

Try

as.data.frame.table(foo)

HTH,
Jorge.-


On Fri, Feb 1, 2013 at 5:00 PM, Gundala Viswanath  wrote:

 I have the following data frame:

  foo
w x y z
 n 1.51550092 1.4337572 1.2791624 1.1771230
 q 0.09977303 0.8173761 1.6123402 0.1510737
 r 1.17083866 1.2469347 0.8712135 0.8488029

 What I want to do is to change it into :

  newdf
 1 nw 1.51550092
 2 q   w 0.09977303
 3 r   w 1.17083866
 4 nx 1.43375725
 5 q   x 0.81737606
 6 r   x 1.24693468
 7 n   y 1.27916241
 8 q   y 1.61234016
 9 r   y 0.87121353
 10   nz 1.17712302
 11   q   z 0.15107369
 12   rz 0.84880292

 Whtat's the way to do it?

  - G.V.

 [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] facet plot

2013-02-01 Thread Dennis Murphy
Hi:

Try

DF - structure(list(Gene = structure(1:5, .Label = c(NM_001001130,
NM_001001144, NM_001001152, NM_001001160, NM_001001176
), class = factor), T0h = c(68L, 0L, 79L, 1L, 0L), T0.25h = c(95L,
1L, 129L, 1L, 0L), T0.5h = c(56L, 4L, 52L, 2L, 0L), T1h = c(43L,
0L, 50L, 0L, 0L), T2h = c(66L, 1L, 24L, 0L, 0L), T3h = c(62L,
1L, 45L, 1L, 0L), T6h = c(68L, 1L, 130L, 0L, 0L), T12h = c(90L,
4L, 154L, 0L, 0L), T24h = c(63L, 1L, 112L, 0L, 0L), T48h = c(89L,
2L, 147L, 1L, 0L)), .Names = c(Gene, T0h, T0.25h, T0.5h,
T1h, T2h, T3h, T6h, T12h, T24h, T48h), class =
data.frame, row.names = c(1, 2, 3, 4, 5))

library(reshape2)
library(ggplot2)

# stack the time-related variables by Gene
Dfm - melt(DF, id = Gene)
# extract the numeric time values
Dfm$time - as.numeric(gsub([a-zA-Z], , as.character(Dfm$variable)))

ggplot(Dfm, aes(x = time, y = value)) + geom_point() + facet_wrap(~ Gene)


Dennis

On Thu, Jan 31, 2013 at 2:37 AM, Jose Iparraguirre
jose.iparragui...@ageuk.org.uk wrote:
 Hi Robin,

 You didn't provide the list with a clear structure of your data, but I hope I 
 understood it properly. If not, it'd be easy for you to adapt what follows.

 I guess your data looks like this:

Gene  T0h T0.25h T0.5h  T1h  T2h  T3h  T6h T12h T24h T48h
 1  NM_001001130   68 9556   43   66   62   68   90   63   89
 2  NM_0010011440  1 40111412
 3  NM_001001152   7912952   50   24   45  130  154  112  147
 4  NM_0010011601  1 20010001
 5  NM_0010011760  0 00000000
 ...

 If this is the case, first reshape your data (say, gene.df) to a long format:

 Gene Time
 1   NM_001001130   68
 2   NM_0010011440
 3   NM_001001152   79
 4   NM_0010011601
 5   NM_0010011760
 6   NM_0010011771
 7   NM_0010011780
 8   NM_0010011790
 9   NM_001001182 4691
 ...

 Then, you create the ggplot:

 sp - ggplot(gene.df, aes(x=Gene, y=Time)) + geom_point(shape=1)

 And use the function facet_wrap():

 sp + facet_wrap( ~ Gene,ncol=3)

 You can set a different number of columns and a different shape for the 
 markers, of course.

 Hope this helps.

 Regards,

 José


 José Iparraguirre
 Chief Economist
 Age UK


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
 Behalf Of Robin Mjelle
 Sent: 31 January 2013 09:06
 To: r-help@r-project.org
 Subject: [R] facet plot

 Hi all,

 I want to plot a facet plot with column names as x and column values as y.
 One plot for each row. here is part of my dataset:

 Gene T0h T0.25h T0.5h T1h T2h T3h T6h T12h T24h T48h  NM_001001130 68 95
 56 43 66 62 68 90 63 89  NM_001001144 0 1 4 0 1 1 1 4 1 2  NM_001001152 79
 129 52 50 24 45 130 154 112 147  NM_001001160 1 1 2 0 0 1 0 0 0 1
 NM_001001176 0 0 0 0 0 0 0 0 0 0  NM_001001177 1 3 2 3 0 1 1 0 2 0
 NM_001001178 0 0 0 0 0 0 0 0 0 0  NM_001001179 0 0 0 0 0 0 0 0 0 0
 NM_001001180 1 0 0 1 0 0 0 0 0 0  NM_001001181 350 539 424 470 441 451 554
 553 419 548  NM_001001182 4691 6458 5686 6761 5944 4486 6030 5883 6009 7552
 NM_001001183 22 23 12 21 25 17 34 40 33 32  NM_001001184 138 147 111 100 54
 61 61 77 57 99  NM_001001185 2 3 4 3 4 5 9 2 4 4  NM_001001186 231 337 187
 168 148 178 275 319 181 279

 [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

 Wrap Up and Run 10k is back!

 Also, new for 2013 – 2km intergenerational walks at selected venues. So 
 recruit a buddy, dust off the trainers and beat the winter blues by
 signing up now:

 http://www.ageuk.org.uk/10k

  Milton Keynes | Oxford | Sheffield | Crystal Palace | Exeter 
 | Harewood House, Leeds |
  Tatton Park, Cheshire | Southampton | 
 Coventry



 Age UK Improving later life

 http://www.ageuk.org.uk




 ---
 Age UK is a registered charity and company limited by guarantee, (registered 
 charity number 1128267, registered company number 6825798).
 Registered office: Tavis House, 1-6 Tavistock Square, London WC1H 9NA.

 For the purposes of promoting Age UK Insurance, Age UK is an Appointed 
 Representative of Age UK Enterprises Limited, Age UK is an Introducer
 Appointed Representative of JLT Benefit Solutions Limited and Simplyhealth 
 Access for the purposes of introducing potential annuity and health
 cash plans customers respectively.  Age UK Enterprises Limited, JLT Benefit 
 Solutions Limited and Simplyhealth Access are all authorised and
 regulated by the Financial Services Authority.
 --

 This email and any files transmitted with it are confidential and intended 
 solely for the use of the individual 

Re: [R] Nested loop and output help

2013-02-01 Thread PIKAL Petr
Hi

see inline

 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of staysafe23
 Sent: Friday, February 01, 2013 1:01 AM
 To: r-help@r-project.org
 Subject: [R] Nested loop and output help
 
 Hello Everyone,
 
 My name is Thomas and I have been using R for one week. I recently
 found your site and have been able to search the archives of posts.
 This has given me some great information that has allowed me to craft
 an initial design to an inquiry I would like to make into the breakdown
 of McNemar's test. I have read an intro to R manual and the posting
 guides and hope I am not violating them with this post. If so I will
 re-ask my question in the proper format.
 
 I have succeeded in writing a loop to vary one condition of my inquiry
 but I am unable to understand how to vary the remaining three
 conditions, each with its own for loop. Each time I try to do so I fail
 miserably. Here is my current code :
 
 n - seq(5,10,by=1)
 
 for(i in n) {
 
 z1 - rnorm(n,mean=400, sd=70)
 
 z2 - rnorm(n,mean=450, sd=90)
 
 cor - runif(1,min=0.4, max=0.6)
 
 X - z1
 
 Y = cor*z1+(1-cor)*z2
 
 mat1 - cbind(X,Y)
 
 dev1 - sample(-40:40, 1, replace=T)
 
 cut1 - mean(X) + dev1
 
 dev2 - sample(12:54, 1, replace=T)
 
 cut2 - mean(X) + dev1 + dev2
 
 X2 - X-cut1
 
 Y2 - Y-cut2
 
 c3 - cor(X2,Y2)
 
 mat2 -cbind(X2,Y2)
 
 a11 - ifelse( X  cut1  Y  cut2, 1, 0)
 
 a12 - ifelse( X  cut1  Y = cut2, 1, 0)
 
 a21 - ifelse( X = cut1  Y  cut2, 1, 0)
 
 a22 - ifelse( X = cut1  Y = cut2, 1, 0)
 
 mat3 -matrix(c(sum(a11),sum(a21), sum(a12),sum(a22)), nrow = 2)
 
 mat4 -matrix(c(sum(a11),sum(a22), sum(a12),sum(a21)), nrow = 2)
 
 out3a - mcnemar.test(mat3, correct=FALSE)
 
 out3b - mcnemar.test(mat3, correct=TRUE)
 
 out4a - chisq.test(mat4, correct = FALSE)
 
 out4b - chisq.test(mat4, correct = TRUE)
 
 print(mat1)
 
 print(mat2)
 
 print(cut1)
 
 print(cut2)
 
 print(mat3)
 
 print(out3a)
 
 print(out3b)
 
 print(mat4)
 
 print(out4a)
 
 print(out4b)
 
 }
 
 .

Use list structure for keeping such results.

lll-list()
for(j in 1:5) {
lll[[j]] - list()
for( i in 1:3) lll[[j]][[i]]-rnorm(10)
}

after population of a list you can print it as a whole or only part. Here is an 
example with your code.

n - seq(5,10,by=1)

lll - vector(mode = list, length = 10)
names(lll) - c(mat1, mat2, mat3, mat4, cut1, cut2, out3a, 
out3b, out4a, out4b)

n - seq(5,10,by=1)

for(i in n) {
z1 - rnorm(n,mean=400, sd=70)
z2 - rnorm(n,mean=450, sd=90)
cor - runif(1,min=0.4, max=0.6)
X - z1
Y = cor*z1+(1-cor)*z2

lll[[mat1]] - cbind(X,Y)
dev1 - sample(-40:40, 1, replace=T)

lll[[cut1]] - mean(X) + dev1
dev2 - sample(12:54, 1, replace=T)

lll[[cut2]] - mean(X) + dev1 + dev2
X2 - X-cut1
Y2 - Y-cut2
c3 - cor(X2,Y2)

lll[[mat2]] -cbind(X2,Y2)
a11 - ifelse( X  cut1  Y  cut2, 1, 0)
a12 - ifelse( X  cut1  Y = cut2, 1, 0)
a21 - ifelse( X = cut1  Y  cut2, 1, 0)
a22 - ifelse( X = cut1  Y = cut2, 1, 0)

lll[[mat3]] -matrix(c(sum(a11),sum(a21), sum(a12),sum(a22)), nrow = 2)
lll[[mat4]] -matrix(c(sum(a11),sum(a22), sum(a12),sum(a21)), nrow = 2)
lll[[out3a]] - mcnemar.test(mat3, correct=FALSE)
lll[[out3b]] - mcnemar.test(mat3, correct=TRUE)
lll[[out4a]] - chisq.test(mat4, correct = FALSE)
lll[[out4b]] - chisq.test(mat4, correct = TRUE)
}

print(lll)

 
 Other than the sample size of the random draws I would like to nest for
 loops for cor, dev1, and dev2 according to the following sequences
 respectively:
 
 cor - seq(-0.5,0.5, by=0.25)

do not use floating points in cycle.

better to use

for (k in 1:n) {
cc - cor[k]
make computation with cc
}


 
 dev1 - seq(-100,100, by=10)
 
 dev2 - seq(12,54, by=10)
 
 .
 
 After doing so I would like to put each matrix and their respective
 tests into a text file so that I can examine the results. I would like
 to put the results in a .txt file each time the loop finishes one case.
 I would like to append this text file with subsequent matrices and
 results rendered by each iteration of the nested for loop.  I have seen
 some very nice examples of output that R can render. I would like to
 simply display each matrix and their tests.

maybe R2HTML or latex in Hmisc package can 

Regards 
Petr

 
 Thank you to all the teachers and students on this forum. The only
 reason I have been able to craft this inquiry is due to the questions
 and answers I have found through searching the archive. Thank you
 kindly for your assistance and for freely sharing your knowledge.
 
 Best wishes,
 Thomas
 
   [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 

Re: [R] difftime() out by 1 hour

2013-02-01 Thread PIKAL Petr
Hi

The same with me on Windows
Most probably an issue of daylight savings time setting.

Sys.timezone()
[1] CET

Regards
Petr


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of Pascal Oettli
 Sent: Friday, February 01, 2013 9:18 AM
 To: Boots
 Cc: R Help
 Subject: Re: [R] difftime() out by 1 hour
 
 Hello,
 
 Here is the result I get using your script:
 
   sessionInfo()
 R version 2.15.2 (2012-10-26)
 Platform: x86_64-suse-linux-gnu (64-bit)
 
 
 [1] 2012-03-31 21:00:00 2012-04-01 00:00:00 2012-04-01 03:00:00
 [4] 2012-04-01 06:00:00
 Time differences in hours
 [1] 3 3 3
 attr(,tzone)
 [1] 
 [1] 2012-10-06 21:00:00 2012-10-07 00:00:00 2012-10-07 03:00:00
 [4] 2012-10-07 06:00:00
 Time differences in hours
 [1] 3 3 3
 attr(,tzone)
 [1] 
 
 Regards,
 Pascal
 
 
 Le 01/02/2013 16:27, Boots a écrit :
  I have a problem with results  from difftime being 1 hour different
 than expected. 2 examples are given below:
 
  datetime - matrix(data=rbind(c(2012-03-31 21:00:00, 2012-04-01
 00:00:00, 2012-04-01 03:00:00, 2012-04-01 06:00:00),
  c(2012-10-06 21:00:00, 2012-10-07 00:00:00,
  2012-10-07 03:00:00, 2012-10-07 06:00:00)), ncol=4, nrow=2)
 
  for (row in 1:2) {
 x - datetime[row,]
 x1=x[1:length(x)-1]
 x2=x[2:length(x)]
 xd=difftime(x2,x1)
 print (x)
 print (xd)
  }
 
  [1] 2012-03-31 21:00:00 2012-04-01 00:00:00 2012-04-01 03:00:00
 2012-04-01 06:00:00
  Time differences in hours
  [1] 3 4 3
  attr(,tzone)
  [1] 
  [1] 2012-10-06 21:00:00 2012-10-07 00:00:00 2012-10-07 03:00:00
 2012-10-07 06:00:00
  Time differences in hours
  [1] 3 2 3
  attr(,tzone)
  [1] 
 
  Accurate results would be 3 3 3 (hours) in both cases.
  The problem occurs randomly, but only around midnight.
  All the dates between 2012-04-01 06:00:00 and 2012-10-06 21:00:00
 are fine.
 
  I have seen postings about problems with POSIXlt and POSIXct
  accuracy because of truncation rather than rounding of floating point
 representations.
  The response was that it was expected behaviour.
 
  So my question is this expected behaviour, have I used difftime
  incorrectly, and if not, is there a set of date/time functions and
 classes that use integer representation to give more accurate results?
 I am new to R and I may well be using the wrong set of functions.
  Garry
 
  [[alternative HTML version deleted]]
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
  http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Cumulative Incidence Function and Pseudovalue

2013-02-01 Thread Tasnuva Tabassum
Hi,
I want to write own R functions for cumulative incidence function and also
for the pseudovalue of the cumulative incidence function.
Can you help me?
Tas.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Was confused with options(error = expression(NULL)) in example(stop)

2013-02-01 Thread Suharto Anggono Suharto Anggono
In example for function 'stop' in R, there is
options(error = expression(NULL))
with comment
# don't stop on stop(.)   Use with CARE! 

I was interested, wanted to know how don't stop on stop(.) was. So, I tried 
it.

Typing
example(stop)
at the R prompt and pressing ENTER give this.

 example(stop)

stop options(error = expression(NULL))

stop # don't stop on stop(.)   Use with CARE! 
stop
stop iter - 12

stop if(iter  10) stop(too many iterations)
Error in eval(expr, envir, enclos) : too many iterations


R still stops on stop(.).

That confused me. Then, I tried manually. But I still couldn't get
options(error = expression(NULL))
worked. Everything seemed to be as usual. I gave up.

Some time later, I saw somewhere in the internet that
options(error = expression(NULL))
worked in batch mode. Ah, running from R CMD BATCH. It had no effect in 
interactive session. I hadn't thought of it before.


 sessionInfo()
R version 2.15.2 (2012-10-26)
Platform: i386-w64-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

loaded via a namespace (and not attached):
[1] tools_2.15.2

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Using eigen() for extracting only few major eigenpairs

2013-02-01 Thread Pierrick Bruneau
Thanks a lot, I'll test this package very soon.
As it seems general purpose (ie not specifically fitted to the square
symmetric context), I hope the advantage over the standard routine for
symmetric matrices remains significant.

Pierrick Bruneau
CRP Gabriel Lippmann

On Thursday, January 31, 2013, Mark Leeds wrote:

 hi: the irlba package does what you're looking for.


 On Thu, Jan 31, 2013 at 3:32 AM, Pierrick Bruneau 
 pbrun...@gmail.comjavascript:_e({}, 'cvml', 'pbrun...@gmail.com');
  wrote:

 Hi everyone,

 I am using eigen() to extract the 2 major eigenpairs from a large real
 square symmetric matrix. The procedure is already rather efficient, but
 becomes somehow slow for real time needs with moderately large matrices
 (few thousand lines).

 The R implementation statically extracts all eigenvalues (and optionally
 associated eigenvectors). I heard about optimizations of the eigen
 decomposition when only few eigenpairs are needed : did somebody already
 care about this problem (through a contributed package for example) ? Or
 do
 I have to directly try to mess around with the LAPACK library (and
 contribute the package myself afterwards) ?

 Thanks by advance for your help,
 Pierrick Bruneau
 CRP Gabriel Lippmann

 [[alternative HTML version deleted]]

 __
 R-help@r-project.org javascript:_e({}, 'cvml', 
 'R-help@r-project.org');mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] how to setdiff on lists of lists

2013-02-01 Thread Simone Gabbriellini
Dear List,

I have a data structure like:

 aa
[[1]]
[1] 1 2 3

[[2]]
[1] 4 5 6

 bb
[[1]]
[1] 3

[[2]]
[1] 5

I would like to set differences between aa and bb and get as
result another list of lists like:

 res
[[1]]
[1] 1 2

[[2]]
[1] 4 6

I am trying:

lapply(aa, setdiff, bb)

but I simply get aa back as result. Could you please point me in the
right direction about the combination of functions that I need to
accomplish this task?

Best regards,
Simone

-- 
Simone Gabbriellini, PhD

PostDoc@DISI, University of Bologna
mobile: +39 340 39 75 626
email: simone.gabbriell...@unibo.it
home: www.digitaldust.it

DigitalBrains srl
Amministratore
mobile: +39 340 39 75 626
email: simone.gabbriell...@digitalbrains.it
home: www.digitalbrains.it

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to setdiff on lists of lists

2013-02-01 Thread PIKAL Petr
Hi

you have something wrong in your R session, works for me.

 aa-list(c(1:3), c(4:6))
 aa
[[1]]
[1] 1 2 3

[[2]]
[1] 4 5 6

 bb-list(3,5)
 bb
[[1]]
[1] 3

[[2]]
[1] 5

 lapply(aa, setdiff, bb)
[[1]]
[1] 1 2

[[2]]
[1] 4 6

Regards
Petr


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of Simone Gabbriellini
 Sent: Friday, February 01, 2013 11:16 AM
 To: r-help@r-project.org
 Subject: [R] how to setdiff on lists of lists
 
 Dear List,
 
 I have a data structure like:
 
  aa
 [[1]]
 [1] 1 2 3
 
 [[2]]
 [1] 4 5 6
 
  bb
 [[1]]
 [1] 3
 
 [[2]]
 [1] 5
 
 I would like to set differences between aa and bb and get as result
 another list of lists like:
 
  res
 [[1]]
 [1] 1 2
 
 [[2]]
 [1] 4 6
 
 I am trying:
 
 lapply(aa, setdiff, bb)
 
 but I simply get aa back as result. Could you please point me in the
 right direction about the combination of functions that I need to
 accomplish this task?
 
 Best regards,
 Simone
 
 --
 Simone Gabbriellini, PhD
 
 PostDoc@DISI, University of Bologna
 mobile: +39 340 39 75 626
 email: simone.gabbriell...@unibo.it
 home: www.digitaldust.it
 
 DigitalBrains srl
 Amministratore
 mobile: +39 340 39 75 626
 email: simone.gabbriell...@digitalbrains.it
 home: www.digitalbrains.it
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ks.test and wilcoxon.test results differ from other stat.packages

2013-02-01 Thread Meyners, Michael
Impossible to say w/o a reproducible example, but to start with let me suggest 
looking at the exact= (both functions) and correct= (wilcox.test) arguments. 
Experience shows that some change of the default settings allows you to 
reproduce results from other software (and the help pages will explain you what 
the settings do; by that, you might also learn what your other tools actually 
calculate if you have no documentation at hand).

HTH, Michael

 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of Alexander Favorov
 Sent: Freitag, 1. Februar 2013 09:00
 To: r-help@r-project.org
 Subject: [R] ks.test and wilcoxon.test results differ from other
 stat.packages
 
 Probably, it's an obvious info, but I have not found anything in R FAQ
 concerning this feature/bug.
 
 The results of ks.test and wilcoxon.test (in the Mann-Whitney version,
 paired = 'FALSE') don't coincide with the results from the other
 statistical packages, e.g. Statistica, Medcalc, and (as for MW test)
 from the numerous online MW tests.
 
 E.g.
 Statistica p-value=0.0435353
 Medcalc p-value=0.0435354
 R p-value=0.04635
 
 If I want to obtain result of test once, it does not matter.
 
 But what should I do if I want to perform Monte-Carlo simulations and I
 need in 1 or even 100 000 p-values and then will build some
 distribution and then use results of Statictica? Whtever, the
 descrepancy bothers.
 
 Are there some alternative packages for non-parametric statistics that
 produce results comparable with the other program packages? Or,
 probably, there is exists some environment variable to perform
 calculations in more common way?
 
 Thank you!
 
 Sasha.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] how to

2013-02-01 Thread Simone Gabbriellini
Dear List,

I have a list of lists and on each list I want to apply a function
from the igraph package, graph.knn. I need to calculate graph.knn for
each list on a graph g, then retrieve one of the result called $knn
and calculate its mean. The non-R style code looks like this:

for(i in listOfLists){
print(mean(graph.knn(g, i)$knn))
}

Is there any way to convert this in a one-liner? I have tried to
figure it out with lapply() or mapply() but with no success.

Thank you in advance for your help.

Simone

-- 
Simone Gabbriellini, PhD

PostDoc@DISI, University of Bologna
mobile: +39 340 39 75 626
email: simone.gabbriell...@unibo.it
home: www.digitaldust.it

DigitalBrains srl
Amministratore
mobile: +39 340 39 75 626
email: simone.gabbriell...@digitalbrains.it
home: www.digitalbrains.it

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] help on proportions

2013-02-01 Thread Ista Zahn
Hi Srini,

This is a statistics question, not a question about R, so this may not
be the best place to ask. Try posting at
http://stats.stackexchange.com/ or another statistics help list.

Best,
Ista

On Thu, Jan 31, 2013 at 11:11 PM, Srinivas Iyyer
srini_iyyer_...@yahoo.com wrote:
 Hi:
 Apologies for asking the following question. As this may sound very basic and 
 stupid for this forum , I honestly do not know how to solve it and I do not 
 have a teacher who can help me understand.

 I have list of genes (200) that are involved in a particular process and I 
 call this as a ProcSet.   From an independent experiment I found that out of 
 10,000 genes, 1500 are significant and I call these1500 genes as ResultSet.

 The intersection of ResultSet and ProcSet are 80 genes.

 That means 40% of ProcSet are significant.

  How do I calculate that 40% is significant and more than I expect by chance 
 given ResultSet and 10,000 genes I evaluated in the experiment.

 What I have:
 n = 200 (ProcSet)
 p = 0.4

 N = 1500  (ResultSet)

 N1 =10,000

 Pn = 0.15

 What kind of test will help me know that 0.4 is significant given 0.15. Any 
 suggestions will greatly help me.

 Thank you.
 Srini

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Question on plotCI function

2013-02-01 Thread Ista Zahn
There are many plotCI functions in many different packages... which
one are you referring to? Also please construct a reproducible example
illustrating your problem.

Best,
Ista

On Thu, Jan 31, 2013 at 11:58 PM, li li hannah@gmail.com wrote:
 Hi all,
In my plotCI function, the argument x is chosen to be seq(0.05, 0.95,
 by=0.05).
 However, when I make the plot, the plot has the x coordinate goes  1:19.
Does anyone know how to make the x coordinate to be (0,0.5, 0.1, ...,
 0.95).
Thank you.
   Hanna

 [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to

2013-02-01 Thread PIKAL Petr
Hi

I do not see any problem with your code. *apply functions are also hidden 
cycles and there shall not be substantial improvement in speed.

Why you do not want to use for cycle?

Petr


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of Simone Gabbriellini
 Sent: Friday, February 01, 2013 12:54 PM
 To: r-help@r-project.org
 Subject: [R] how to
 
 Dear List,
 
 I have a list of lists and on each list I want to apply a function from
 the igraph package, graph.knn. I need to calculate graph.knn for each
 list on a graph g, then retrieve one of the result called $knn and
 calculate its mean. The non-R style code looks like this:
 
 for(i in listOfLists){
   print(mean(graph.knn(g, i)$knn))
 }
 
 Is there any way to convert this in a one-liner? I have tried to figure
 it out with lapply() or mapply() but with no success.
 
 Thank you in advance for your help.
 
 Simone
 
 --
 Simone Gabbriellini, PhD
 
 PostDoc@DISI, University of Bologna
 mobile: +39 340 39 75 626
 email: simone.gabbriell...@unibo.it
 home: www.digitaldust.it
 
 DigitalBrains srl
 Amministratore
 mobile: +39 340 39 75 626
 email: simone.gabbriell...@digitalbrains.it
 home: www.digitalbrains.it
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Filter according to the latest data

2013-02-01 Thread jim holtman
try this:

 x - read.table(text =  No.  Change   Date
+ A  123   final2013-01-15
+ B  123   error   2013-01-16
+ C  123   'bug fixed'   2013-01-17
+ D  111   final2013-01-12
+ , header = TRUE
+ , as.is = TRUE
+ )
 do.call(rbind, lapply(split(x, x$No.), function(.sec){
+ .sec[which(.sec$Date == max(.sec$Date))[1L], ]
+ }))
No.Change   Date
111 111 final 2013-01-12
123 123 bug fixed 2013-01-17


On Fri, Feb 1, 2013 at 3:04 AM, Mat matthias.we...@fnt.de wrote:
 Hello together,

 i have a data.frame, like this one:
  No.  Change   Date
 A  123   final2013-01-15
 B  123   error   2013-01-16
 C  123   bug fixed   2013-01-17
 D  111   final2013-01-12

 and now a want a new data.frame which includes only the newest entry for
 each number.
 The solution look like this:
  No.  Change   Date
 C  123   bug fixed   2013-01-17
 D  111   final2013-01-12

 is there any way to filter my data.frame to the latest data, perhabs max?

 Thanks.

 Mat



 --
 View this message in context: 
 http://r.789695.n4.nabble.com/Filter-according-to-the-latest-data-tp4657248.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to setdiff on lists of lists

2013-02-01 Thread jim holtman
You can also try 'mapply'

 aa-list(c(1:3), c(4:6))
  bb-list(3,5)
 mapply(setdiff, aa, bb)
 [,1] [,2]
[1,]14
[2,]26



On Fri, Feb 1, 2013 at 5:16 AM, Simone Gabbriellini
simone.gabbriell...@gmail.com wrote:
 Dear List,

 I have a data structure like:

 aa
 [[1]]
 [1] 1 2 3

 [[2]]
 [1] 4 5 6

 bb
 [[1]]
 [1] 3

 [[2]]
 [1] 5

 I would like to set differences between aa and bb and get as
 result another list of lists like:

 res
 [[1]]
 [1] 1 2

 [[2]]
 [1] 4 6

 I am trying:

 lapply(aa, setdiff, bb)

 but I simply get aa back as result. Could you please point me in the
 right direction about the combination of functions that I need to
 accomplish this task?

 Best regards,
 Simone

 --
 Simone Gabbriellini, PhD

 PostDoc@DISI, University of Bologna
 mobile: +39 340 39 75 626
 email: simone.gabbriell...@unibo.it
 home: www.digitaldust.it

 DigitalBrains srl
 Amministratore
 mobile: +39 340 39 75 626
 email: simone.gabbriell...@digitalbrains.it
 home: www.digitalbrains.it

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to

2013-02-01 Thread Simone Gabbriellini
Hi,

Yes I know it works, but I'd like to assign the results like this:

V(g)$meanknn - ONELINER

Where V(g) elencates all the nodes in my graph...

Thanks,
Simone

Sent from my iPhone. Please excuse brevity and odd typos

On 01/feb/2013, at 13:31, PIKAL Petr petr.pi...@precheza.cz wrote:

 Hi
 
 I do not see any problem with your code. *apply functions are also hidden 
 cycles and there shall not be substantial improvement in speed.
 
 Why you do not want to use for cycle?
 
 Petr
 
 
 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of Simone Gabbriellini
 Sent: Friday, February 01, 2013 12:54 PM
 To: r-help@r-project.org
 Subject: [R] how to
 
 Dear List,
 
 I have a list of lists and on each list I want to apply a function from
 the igraph package, graph.knn. I need to calculate graph.knn for each
 list on a graph g, then retrieve one of the result called $knn and
 calculate its mean. The non-R style code looks like this:
 
 for(i in listOfLists){
print(mean(graph.knn(g, i)$knn))
 }
 
 Is there any way to convert this in a one-liner? I have tried to figure
 it out with lapply() or mapply() but with no success.
 
 Thank you in advance for your help.
 
 Simone
 
 --
 Simone Gabbriellini, PhD
 
 PostDoc@DISI, University of Bologna
 mobile: +39 340 39 75 626
 email: simone.gabbriell...@unibo.it
 home: www.digitaldust.it
 
 DigitalBrains srl
 Amministratore
 mobile: +39 340 39 75 626
 email: simone.gabbriell...@digitalbrains.it
 home: www.digitalbrains.it
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Converting Date to Unix Time

2013-02-01 Thread Lorenzo Isella
Dear All,
I am struggling with a simple problem. I did some online research, but
I am still banging my head against the floor.
I read a csv file into a data frame and I have a columns with entries like

2012-04-13 00:23:45
2012-04-22 07:53:09


etc..
Each line posted above is the content of a single cell.
How can I convert that into Unix time?
Many thanks

Lorenzo

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Transforming 4x3 data frame into 2 column df in R

2013-02-01 Thread nalluri pratap
newdf-data.frame(k1=rep(row.names(foo),ncol(foo)),stack(foo))

--- On Fri, 1/2/13, Gundala Viswanath gunda...@gmail.com wrote:


From: Gundala Viswanath gunda...@gmail.com
Subject: [R] Transforming 4x3 data frame into 2 column df in R
To: r-h...@stat.math.ethz.ch r-h...@stat.math.ethz.ch
Date: Friday, 1 February, 2013, 11:30 AM


I have the following data frame:

 foo
           w         x         y         z
n 1.51550092 1.4337572 1.2791624 1.1771230
q 0.09977303 0.8173761 1.6123402 0.1510737
r 1.17083866 1.2469347 0.8712135 0.8488029

What I want to do is to change it into :

 newdf
1     n    w 1.51550092
2     q   w 0.09977303
3     r   w 1.17083866
4     n    x 1.43375725
5     q   x 0.81737606
6     r   x 1.24693468
7     n   y 1.27916241
8     q   y 1.61234016
9     r   y 0.87121353
10   n    z 1.17712302
11   q   z 0.15107369
12   r    z 0.84880292

Whtat's the way to do it?

- G.V.

    [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] How does this function print, why is n1 which equals 1 printed as 2?

2013-02-01 Thread John Sorkin
Windows 7, R 2.12.1
Colleagues,
I am trying to understand the n.for.2means function. The code below is a copy 
of the function (renamed to n.for.2means.js). I have inserted a single line of 
code towards the bottom of the function which uses the cat function to print 
the value of n1. You will note the value (preceded by stars) is printed as 1.
The function (1) prints a lot of output without any instructions in the 
function to print anything (other than the cat statement I added), and when it 
prints (2) reports the value of n1 to be 2!.
I have two questions, (i) how is the function printing when there is no code to 
print and (ii) how is n1 which equals 1 being reported as 2? I suspect there is 
something fundamental about R that I don't know.
Thank you for the help.
John
 
 
library(epicalc)
n.for.2means.js - function (mu1, mu2, sd1, sd2, ratio = 1, alpha = 0.05, power 
= 0.8) 
{
  n1 - (sd1^2 + sd2^2/ratio) * (qnorm(1 - alpha/2) - qnorm(1 -  power))^2/(mu1 
- mu2)^2
  n1 - round(n1)
  n2 - ratio * n1
  if (length(alpha) == 1) {
alpha1 - NULL
  }
  else {
alpha1 - alpha
  }
  if (length(power) == 1) {
power1 - NULL
  }
  else {
power1 - power
  }
  if (length(ratio) == 1) {
ratio1 - NULL
  }
  else {
ratio1 - ratio
  }
  table1 - cbind(mu1, mu2, sd1, sd2, n1, n2, alpha1, power1, 
  ratio1)
  colnames(table1)[colnames(table1) == alpha1] - alpha
  colnames(table1)[colnames(table1) == power1] - power
  colnames(table1)[colnames(table1) == ratio1] - n2/n1
  table1 - as.data.frame(table1)
  cat(**n1***=,n1,\n)
  returns - list(mu1 = mu1, mu2 = mu2, sd1 = sd1, sd2 = sd2, 
  alpha = alpha, n1 = n1, n2 = n2, power = power, ratio = 
ratio, 
  table = table1)
  class(returns) - c(n.for.2means, list)
  returns
}
n.for.2means.js(0,8,10/6,10/6)
 
 
John David Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)
Confidentiality Statement:
This email message, including any attachments, is for the sole use of the 
intended recipient(s) and may contain confidential and privileged information.  
Any unauthorized use, disclosure or distribution is prohibited.  If you are not 
the intended recipient, please contact the sender by reply email and destroy 
all copies of the original message. 
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] expand.grid on contents of a list

2013-02-01 Thread Dimitri Liakhovitski
Hello!
I have a list of variable length. One example is:
X=vector(list,3)
X[[1]]=1:2
X[[2]]=1:2
X[[3]]=1:2
How could I run expand.grid on the elements of X so that the results would
be the same as expand.grid(1:2,1:2,1:2)?

Thank you!
Dimitri

-- 
Dimitri Liakhovitski
gfk.com http://marketfusionanalytics.com/

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Converting Date to Unix Time

2013-02-01 Thread Hasan Diwan
Mr Isella,

On 1 February 2013 05:37, Lorenzo Isella lorenzo.ise...@gmail.com wrote:
 How can I convert that into Unix time?

format.POSIXct(dateCol, '%s'); -- H
-- 
Sent from my mobile device
Envoyait de mon portable

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How does this function print, why is n1 which equals 1 printed as 2?

2013-02-01 Thread Duncan Murdoch

On 13-02-01 8:47 AM, John Sorkin wrote:

Windows 7, R 2.12.1
Colleagues,
I am trying to understand the n.for.2means function. The code below is a copy 
of the function (renamed to n.for.2means.js). I have inserted a single line of 
code towards the bottom of the function which uses the cat function to print 
the value of n1. You will note the value (preceded by stars) is printed as 1.
The function (1) prints a lot of output without any instructions in the 
function to print anything (other than the cat statement I added), and when it 
prints (2) reports the value of n1 to be 2!.
I have two questions, (i) how is the function printing when there is no code to 
print and (ii) how is n1 which equals 1 being reported as 2? I suspect there is 
something fundamental about R that I don't know.


I haven't run the code, but presumably it's just the usual auto 
printing.  Unless a function sets the result to be invisible, results

of functions are printed by calling print() after they are returned.

Since your function returns something with class

c(n.for.2means, list)

R will look for a print method for those classes, and use it to print 
the result.  There's no print.list method in standard R; it just uses 
the default.  But there's probably a print.n.for.2means function in the 
package (possibly not exported; you might need 
epicalc:::print.n.for.2means to see it).  There might be a print.list 
function there instead or as well.


Duncan Murdoch


Thank you for the help.
John


library(epicalc)
n.for.2means.js - function (mu1, mu2, sd1, sd2, ratio = 1, alpha = 0.05, power 
= 0.8)
{
   n1 - (sd1^2 + sd2^2/ratio) * (qnorm(1 - alpha/2) - qnorm(1 -  
power))^2/(mu1 - mu2)^2
   n1 - round(n1)
   n2 - ratio * n1
   if (length(alpha) == 1) {
 alpha1 - NULL
   }
   else {
 alpha1 - alpha
   }
   if (length(power) == 1) {
 power1 - NULL
   }
   else {
 power1 - power
   }
   if (length(ratio) == 1) {
 ratio1 - NULL
   }
   else {
 ratio1 - ratio
   }
   table1 - cbind(mu1, mu2, sd1, sd2, n1, n2, alpha1, power1,
   ratio1)
   colnames(table1)[colnames(table1) == alpha1] - alpha
   colnames(table1)[colnames(table1) == power1] - power
   colnames(table1)[colnames(table1) == ratio1] - n2/n1
   table1 - as.data.frame(table1)
   cat(**n1***=,n1,\n)
   returns - list(mu1 = mu1, mu2 = mu2, sd1 = sd1, sd2 = sd2,
   alpha = alpha, n1 = n1, n2 = n2, power = power, ratio = 
ratio,
   table = table1)
   class(returns) - c(n.for.2means, list)
   returns
}
n.for.2means.js(0,8,10/6,10/6)


John David Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)
Confidentiality Statement:
This email message, including any attachments, is for the sole use of the 
intended recipient(s) and may contain confidential and privileged information.  
Any unauthorized use, disclosure or distribution is prohibited.  If you are not 
the intended recipient, please contact the sender by reply email and destroy 
all copies of the original message.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to

2013-02-01 Thread PIKAL Petr
Hi

So why you do not assign it inside a cycle? As I said lapply does the same as 
for cycle does (maybe in some more compact manner).

Regards
Petr

 -Original Message-
 From: Simone Gabbriellini [mailto:simone.gabbriell...@gmail.com]
 Sent: Friday, February 01, 2013 2:29 PM
 To: PIKAL Petr
 Cc: r-help@r-project.org
 Subject: Re: [R] how to
 
 Hi,
 
 Yes I know it works, but I'd like to assign the results like this:
 
 V(g)$meanknn - ONELINER
 
 Where V(g) elencates all the nodes in my graph...
 
 Thanks,
 Simone
 
 Sent from my iPhone. Please excuse brevity and odd typos
 
 On 01/feb/2013, at 13:31, PIKAL Petr petr.pi...@precheza.cz wrote:
 
  Hi
 
  I do not see any problem with your code. *apply functions are also
 hidden cycles and there shall not be substantial improvement in speed.
 
  Why you do not want to use for cycle?
 
  Petr
 
 
  -Original Message-
  From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
  project.org] On Behalf Of Simone Gabbriellini
  Sent: Friday, February 01, 2013 12:54 PM
  To: r-help@r-project.org
  Subject: [R] how to
 
  Dear List,
 
  I have a list of lists and on each list I want to apply a function
  from the igraph package, graph.knn. I need to calculate graph.knn
 for
  each list on a graph g, then retrieve one of the result called $knn
  and calculate its mean. The non-R style code looks like this:
 
  for(i in listOfLists){
 print(mean(graph.knn(g, i)$knn))
  }
 
  Is there any way to convert this in a one-liner? I have tried to
  figure it out with lapply() or mapply() but with no success.
 
  Thank you in advance for your help.
 
  Simone
 
  --
  Simone Gabbriellini, PhD
 
  PostDoc@DISI, University of Bologna
  mobile: +39 340 39 75 626
  email: simone.gabbriell...@unibo.it
  home: www.digitaldust.it
 
  DigitalBrains srl
  Amministratore
  mobile: +39 340 39 75 626
  email: simone.gabbriell...@digitalbrains.it
  home: www.digitalbrains.it
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide http://www.R-project.org/posting-
  guide.html and provide commented, minimal, self-contained,
  reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] expand.grid on contents of a list

2013-02-01 Thread Dimitri Liakhovitski
Oops, it was easier than I thought:
expand.grid(X)
Dimitri

On Fri, Feb 1, 2013 at 8:48 AM, Dimitri Liakhovitski 
dimitri.liakhovit...@gmail.com wrote:

 Hello!
 I have a list of variable length. One example is:
 X=vector(list,3)
 X[[1]]=1:2
 X[[2]]=1:2
 X[[3]]=1:2
 How could I run expand.grid on the elements of X so that the results would
 be the same as expand.grid(1:2,1:2,1:2)?

 Thank you!
 Dimitri

 --
 Dimitri Liakhovitski
 gfk.com http://marketfusionanalytics.com/




-- 
Dimitri Liakhovitski
gfk.com http://marketfusionanalytics.com/

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How does this function print, why is n1 which equals 1 printed as 2?

2013-02-01 Thread David Winsemius


On Feb 1, 2013, at 7:47 AM, John Sorkin wrote:


Windows 7, R 2.12.1
Colleagues,
I am trying to understand the n.for.2means function. The code below  
is a copy of the function (renamed to n.for.2means.js). I have  
inserted a single line of code towards the bottom of the function  
which uses the cat function to print the value of n1. You will note  
the value (preceded by stars) is printed as 1.
The function (1) prints a lot of output without any instructions in  
the function to print anything (other than the cat statement I  
added), and when it prints (2) reports the value of n1 to be 2!.
I have two questions, (i) how is the function printing when there is  
no code to print and (ii) how is n1 which equals 1 being reported as  
2? I suspect there is something fundamental about R that I don't know.


1) on my machine the output from the cat() call is:

**n1***= 1

2) All of the output without any instructions in the function to  
print anything  is just the value of the list object from the  
function. Unless you return values using the `invisible` function, any  
user define function executed at the console will print its value.  
That is standard interactive session behavior. So one gets after the  
the cat output:


$mu1
[1] 0

$mu2
[1] 8

$sd1
[1] 1.67

$sd2
[1] 1.67

$alpha
[1] 0.05

$n1
[1] 1

$n2
[1] 1

$power
[1] 0.8

$ratio
[1] 1

$table
  mu1 mu2  sd1  sd2 n1 n2
1   0   8 1.67 1.67  1  1

attr(,class)
[1] n.for.2means list


--
David.


Thank you for the help.
John


library(epicalc)
n.for.2means.js - function (mu1, mu2, sd1, sd2, ratio = 1, alpha =  
0.05, power = 0.8)

{
 n1 - (sd1^2 + sd2^2/ratio) * (qnorm(1 - alpha/2) - qnorm(1 -   
power))^2/(mu1 - mu2)^2

 n1 - round(n1)
 n2 - ratio * n1
 if (length(alpha) == 1) {
   alpha1 - NULL
 }
 else {
   alpha1 - alpha
 }
 if (length(power) == 1) {
   power1 - NULL
 }
 else {
   power1 - power
 }
 if (length(ratio) == 1) {
   ratio1 - NULL
 }
 else {
   ratio1 - ratio
 }
 table1 - cbind(mu1, mu2, sd1, sd2, n1, n2, alpha1, power1,
 ratio1)
 colnames(table1)[colnames(table1) == alpha1] - alpha
 colnames(table1)[colnames(table1) == power1] - power
 colnames(table1)[colnames(table1) == ratio1] - n2/n1
 table1 - as.data.frame(table1)
 cat(**n1***=,n1,\n)
 returns - list(mu1 = mu1, mu2 = mu2, sd1 = sd1, sd2 = sd2,
 alpha = alpha, n1 = n1, n2 = n2, power = power,  
ratio = ratio,

 table = table1)
 class(returns) - c(n.for.2means, list)
 returns
}
n.for.2means.js(0,8,10/6,10/6)


John David Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)
Confidentiality Statement:
This email message, including any attachments, is for ...{{dropped:15}}


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] help on proportions

2013-02-01 Thread James C. Whanger
Hello Srini,

It sounds as if you are attempting to establish a prior probability and
compare it to the posterior probability -- a perfect candidate for bayesian
analysis. I would simply do a search for 'bayesian analysis of gene
expression data'  -- there are a number of statistical packages that are
available.  A number of R packages are available as well as a software
package from Yale: http://www.yale.edu/townsend/Software/BAGELTutorial.html

Hope this helps,

James


On Thu, Jan 31, 2013 at 11:11 PM, Srinivas Iyyer
srini_iyyer_...@yahoo.comwrote:

 Hi:
 Apologies for asking the following question. As this may sound very basic
 and stupid for this forum , I honestly do not know how to solve it and I do
 not have a teacher who can help me understand.

 I have list of genes (200) that are involved in a particular process and I
 call this as a ProcSet.   From an independent experiment I found that out
 of 10,000 genes, 1500 are significant and I call these1500 genes as
 ResultSet.

 The intersection of ResultSet and ProcSet are 80 genes.

 That means 40% of ProcSet are significant.

  How do I calculate that 40% is significant and more than I expect by
 chance given ResultSet and 10,000 genes I evaluated in the experiment.

 What I have:
 n = 200 (ProcSet)
 p = 0.4

 N = 1500  (ResultSet)

 N1 =10,000

 Pn = 0.15

 What kind of test will help me know that 0.4 is significant given 0.15.
 Any suggestions will greatly help me.

 Thank you.
 Srini

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
*James C. Whanger*
*
*
*It ain't what you don't know that gets you into trouble. It's what you
know for sure that just ain't so.  Mark Twain*

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] expand.grid on contents of a list

2013-02-01 Thread arun
Hi,
expand.grid(X)  
#  Var1 Var2 Var3 
#1111 
#2211 
#3121 
#4221 
#5112 
#6212 
#7122 
#8222  expand.grid(1:2,1:2,1:2) 
#  Var1 Var2 Var3 
#1111 
#2211 
#3121 
#4221 
#5112 
#6212 
#122 
#8222 
A.K.

- Original Message -
From: Dimitri Liakhovitski dimitri.liakhovit...@gmail.com
To: r-help r-help@r-project.org
Cc: 
Sent: Friday, February 1, 2013 8:48 AM
Subject: [R] expand.grid on contents of a list

Hello!
I have a list of variable length. One example is:
X=vector(list,3)
X[[1]]=1:2
X[[2]]=1:2
X[[3]]=1:2
How could I run expand.grid on the elements of X so that the results would
be the same as expand.grid(1:2,1:2,1:2)?

Thank you!
Dimitri

-- 
Dimitri Liakhovitski
gfk.com http://marketfusionanalytics.com/

    [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] obtainl survival curves for single strata

2013-02-01 Thread Terry Therneau

Stephen,
   I can almost but not quite get my arms around what you are asking.  A bit more detail 
would help.  Like

 cph.approve = what kind of object, generated by function __

But, let me make a guess:
   cph is the result of coxph, and the model has both covariates and a strata
   You want predictioned survival curves, more than 1, of the type covariates = a, b,c, 
strata=1  covariates = d,e, f, strata=2, ... for arbitrary covariates and strata.


Now, the predicted survival curves for different strata are different lengths.
The survfit.coxph routine gets around this by being verbose: it expects you to give it 
covariate sets, and returns
all of the strata for each covariate. This allows it to give a compact result.   You can 
always do:

   newpred - survfit(cox-model-fit, newdata=something)
   newpred[5,17] #  survival curve for the 5th strata, covariates from the 17th row of 
newdata


But, you want to put the results into a matrix, for some pre-specifed set of time points.  
Take it one step further.

mytimepoints - seq(0, 5*365, by=30)  # every 30 days
z - summary(newpred[5,17], time=mytimepoints, extend=TRUE)$surv

The summary.survfit function's time argument was originally written for people who only 
wanted to print certain time points, but it works as well for those who only want to 
extract certain ones.  It correctly handles the fact that the curve is a step function.


Terry Therneau


On 02/01/2013 05:00 AM, r-help-requ...@r-project.org wrote:

What is the syntax to obtain survival curves for single strata on many subjects?

I have a model based on Surv(time,response) object, so there is a single row 
per subject and no start,stop and no switching of strata.

The newdata has many subjects and each subject has a strata and the survival 
based on the subject risk and the subject strata is needed.

If I do

newpred- survfit(cph.approve,new=newapp,se=F)

I get all strata for every subject.

Attempting


  newpred- survfit(cph.approve,new=newapp,id=CertId,se=F)

Error in survfit.coxph(cph.approve, new = newapp, id = CertId, se = F) :
   The individual option is  only valid for start-stop data

  newpred- survfit(cph.approve,new=newapp,indi=T,se=F)

Error in survfit.coxph(cph.approve, new = newapp, indi = T, se = F) :
   The individual option is  only valid for start-stop data

Please, advise if obtaining a single strata for a basic (time,response) model is 
possible? Due to differing lengths of the surv for different strata this will not go in a 
wide data.frame without padding.

Thanks everybody and have a great day.

Stephen B



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Question on plotCI function

2013-02-01 Thread li li
Thanks so much for the reply, Ista. I used plotrix library.

Here is my example:

xx - seq(0.05, 0.95, by=0.05)
lower - c(-2.896865, -2.728416, -2.642574, -2.587724, -2.548672, -2.518994,
-2.495409, -2.476031, -2.459662, -2.445513, -2.433014, -2.421739, -2.411344,
-2.401536, -2.392040, -2.382571, -2.372786, -2.362198, -2.349891)
upper - c(2.311539, 2.372006, 2.423280, 2.469220, 2.511851, 2.552421,
 2.591797, 2.630657, 2.669579, 2.709135, 2.749928, 2.792670, 2.838268,
 2.887976, 2.943683, 3.008502, 3.088240, 3.195954, 3.373528)

library(plotrix)
plotCI(xx,ui=upper,li=lower,err=y,pch=NA, xlab=, ylab=, ylim=c(-5,
5),  slty=solid,scol=blue, lwd=2)


My question is how can I change the xllim to be c(0,1) which corresponds to
the xx values.

   Hanna
2013/2/1 Ista Zahn istaz...@gmail.com

 There are many plotCI functions in many different packages... which
 one are you referring to? Also please construct a reproducible example
 illustrating your problem.

 Best,
 Ista

 On Thu, Jan 31, 2013 at 11:58 PM, li li hannah@gmail.com wrote:
  Hi all,
 In my plotCI function, the argument x is chosen to be seq(0.05, 0.95,
  by=0.05).
  However, when I make the plot, the plot has the x coordinate goes  1:19.
 Does anyone know how to make the x coordinate to be (0,0.5, 0.1, ...,
  0.95).
 Thank you.
Hanna
 
  [[alternative HTML version deleted]]
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.htmlhttp://www.r-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ks.test and wilcoxon.test results differ from other stat.packages

2013-02-01 Thread Rui Barradas

Hello,

As for the KS test, the op might also want to look at

Simard, L'Ecuyer (2011)
http://www.jstatsoft.org/v39/i11

for an account on the different algorithms available, and their accuracy.
Apparently, R uses different algorithms according to the test in 
question. See the details section of ?ks.test. See also the references 
there, such as


Marsaglia, Tsang, Wang (2003)
http://www.jstatsoft.org/v08/i18/

Hope this helps,

Rui Barradas

Em 01-02-2013 11:24, Meyners, Michael escreveu:

Impossible to say w/o a reproducible example, but to start with let me suggest 
looking at the exact= (both functions) and correct= (wilcox.test) arguments. 
Experience shows that some change of the default settings allows you to 
reproduce results from other software (and the help pages will explain you what 
the settings do; by that, you might also learn what your other tools actually 
calculate if you have no documentation at hand).

HTH, Michael


-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
project.org] On Behalf Of Alexander Favorov
Sent: Freitag, 1. Februar 2013 09:00
To: r-help@r-project.org
Subject: [R] ks.test and wilcoxon.test results differ from other
stat.packages

Probably, it's an obvious info, but I have not found anything in R FAQ
concerning this feature/bug.

The results of ks.test and wilcoxon.test (in the Mann-Whitney version,
paired = 'FALSE') don't coincide with the results from the other
statistical packages, e.g. Statistica, Medcalc, and (as for MW test)
from the numerous online MW tests.

E.g.
Statistica p-value=0.0435353
Medcalc p-value=0.0435354
R p-value=0.04635

If I want to obtain result of test once, it does not matter.

But what should I do if I want to perform Monte-Carlo simulations and I
need in 1 or even 100 000 p-values and then will build some
distribution and then use results of Statictica? Whtever, the
descrepancy bothers.

Are there some alternative packages for non-parametric statistics that
produce results comparable with the other program packages? Or,
probably, there is exists some environment variable to perform
calculations in more common way?

Thank you!

Sasha.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-
guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Relative Risk in logistic regression

2013-02-01 Thread Ivan-K
Dear colleagues, I have 2 points: One opinion and one question.

1)
In one paper in a peer-reviewed journal, I read about the idea of using a
logit regression as a surrogate for the log-binomial, just adding the
numerator to the denominator ... 
It’s tempting to immediately get the RR instead of OR ...
I tried it and I think it's a bad idea, the confidence intervals
dramatically inflated!
Any opinions?

2) 
What would be the criteria for selection of link - functions for binary
data?
Usually I use the logit - just for simplest interpretation of parameters.
Using logit, probit, log-log, and log-log, I get identical values of the
maximum-likelihood, Pearson statistics, overdispersion parameter, etc.
However, the regression coefficients and its standard errors are different
(for logit b is the maximum, for the probit – min., for log-log  C- log-log
are between them). LRs close, but the maximum has the log-log. Wald
criterion - the maximum for the probit.
?What are the interpretations for regression parameters (except logit)???

Ivan, IPAE RAS




--
View this message in context: 
http://r.789695.n4.nabble.com/Relative-Risk-in-logistic-regression-tp4657040p4657297.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Question on plotCI function

2013-02-01 Thread MacQueen, Don
You haven't given any y values to plotCI.

Try, for example,

plotCI(xx, (lower+upper)/2, ui=upper, (etc)

What you got was in effect
  plot( seq(along=xx), xx )
which is standard behavior for plot() when no y values are supplied.

-Don

-- 
Don MacQueen

Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062





On 2/1/13 7:21 AM, li li hannah@gmail.com wrote:

Thanks so much for the reply, Ista. I used plotrix library.

Here is my example:

xx - seq(0.05, 0.95, by=0.05)
lower - c(-2.896865, -2.728416, -2.642574, -2.587724, -2.548672,
-2.518994,
-2.495409, -2.476031, -2.459662, -2.445513, -2.433014, -2.421739,
-2.411344,
-2.401536, -2.392040, -2.382571, -2.372786, -2.362198, -2.349891)
upper - c(2.311539, 2.372006, 2.423280, 2.469220, 2.511851, 2.552421,
 2.591797, 2.630657, 2.669579, 2.709135, 2.749928, 2.792670, 2.838268,
 2.887976, 2.943683, 3.008502, 3.088240, 3.195954, 3.373528)

library(plotrix)
plotCI(xx,ui=upper,li=lower,err=y,pch=NA, xlab=, ylab=, ylim=c(-5,
5),  slty=solid,scol=blue, lwd=2)


My question is how can I change the xllim to be c(0,1) which corresponds
to
the xx values.

   Hanna
2013/2/1 Ista Zahn istaz...@gmail.com

 There are many plotCI functions in many different packages... which
 one are you referring to? Also please construct a reproducible example
 illustrating your problem.

 Best,
 Ista

 On Thu, Jan 31, 2013 at 11:58 PM, li li hannah@gmail.com wrote:
  Hi all,
 In my plotCI function, the argument x is chosen to be seq(0.05,
0.95,
  by=0.05).
  However, when I make the plot, the plot has the x coordinate goes
1:19.
 Does anyone know how to make the x coordinate to be (0,0.5, 0.1,
...,
  0.95).
 Thank you.
Hanna
 
  [[alternative HTML version deleted]]
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
 
http://www.R-project.org/posting-guide.htmlhttp://www.r-project.org/post
ing-guide.html
  and provide commented, minimal, self-contained, reproducible code.


   [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Summary of data for each year

2013-02-01 Thread Felipe Carrillo
 Here is another option using plyr:
 
library(plyr)
creek - read.csv(creek.csv)
 library(ggplot2)
 creek[1:10,]
 colnames(creek) - c(date,flow)
 creek$date - as.Date(creek$date, %m/%d/%Y)
 
ddply(creek,year,summarise,MED=median(flow),MEAN=mean(flow),SD=sd(flow),MIN=min(flow))

Felipe D. Carrillo
Supervisory Fishery Biologist
Department of the Interior
US Fish  Wildlife Service
California, USA
http://www.fws.gov/redbluff/rbdd_jsmp.aspx




From: Pascal Oettli kri...@ymail.com
To: Janesh Devkota janesh.devk...@gmail.com 
Cc: r-help@r-project.org 
Sent: Thursday, January 31, 2013 11:52 PM
Subject: Re: [R] Summary of data for each year

Hello,

One possibility is:

 creek - read.csv(creek.csv)
 colnames(creek) - c(date,flow)
 creek$date - as.Date(creek$date, %m/%d/%Y)
 creek - within(creek, year - format(date, '%Y'))

 with(creek, aggregate(flow, by=list(year=year), summary))

HTH,
Pascal


Le 01/02/2013 16:32, Janesh Devkota a écrit :
 Hello All,

 I have a data with two columns. In one column it is date and in another
 column it is flow data.

 I was able to read the data as date and flow data. I used the following
 code:

 creek - read.csv(creek.csv)
 library(ggplot2)
 creek[1:10,]
 colnames(creek) - c(date,flow)
 creek$date - as.Date(creek$date, %m/%d/%Y)

 The link to my data is https://www.dropbox.com/s/eqpena3nk82x67e/creek.csv

 Now, I want to find the summary of each year. I want to especially know
 mean, median, maximum etc.

 Thanks.

 Janesh

     [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Range difference of plot two arrays in one plot

2013-02-01 Thread Yuan, Rebecca
Hello all,

When I tried to plot the following two arrays in one figure with the following:

x = c(0,0,0,10,20,30)
y = c(40,50,60,70,80,90)
plot(x, type='o', ylim=c(min(x),max(x)))
par(new=T)
plot(y, type='l', ylim=c(min(y),max(y)))

Found that the first points and last points from those two arrays are 
overlapping together, but the value 30 is not equal to 90. How could I draw 
those two arrays in one plot with their real values shown in the plot? i.e., x 
is in the lower part of the plot and y is in the upper part of the plot.

Thanks very much!

Cheers,

Rebecca



--
This message, and any attachments, is for the intended r...{{dropped:5}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] [R-pkgs] Version 1.2 of pa package available on CRAN

2013-02-01 Thread Yang Lu
Dear useRs,

Version 1.2 of the R package 'pa' is now available
on CRAN. This package provides tools for conducting
equity portfolio attribution. Two methods included
in the package are the Brinson Method and a
regression-based analysis. The new version allows
users to view attribution results for sub-categories
based on the Brinson Method.

URL –
http://cran.at.r-project.org/web/packages/pa/index.html

It also includes a vignette at
http://cran.at.r-project.org/web/packages/pa/vignettes/pa.pdf.

Best regards,

Yang Lu

Yang Lu '14
SU 2896 Paresky Center
Williams College, MA
(413)884-4847

[[alternative HTML version deleted]]

___
R-packages mailing list
r-packa...@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Filter according to the latest data

2013-02-01 Thread nalluri pratap
library(sqldf)

k1-data.frame(ID=LETTERS[1:4],
No=c(rep(123,3),111),
Change=c(final,error,bug fixed,final),
Date=c(2013-01-15,2013-01-16,2013-01-17,2013-01-12))
 
k1$Date=as.Date(as.character(k1$Date),tz=UK)
 
sqldf(select *
from k1
group by No
having max(Date))


--- On Fri, 1/2/13, Mat matthias.we...@fnt.de wrote:


From: Mat matthias.we...@fnt.de
Subject: [R] Filter according to the latest data
To: r-help@r-project.org
Date: Friday, 1 February, 2013, 1:34 PM


Hello together,

i have a data.frame, like this one:
                 No.          Change           Date          
A              123           final                2013-01-15
B              123           error               2013-01-16
C              123           bug fixed       2013-01-17
D              111           final                2013-01-12

and now a want a new data.frame which includes only the newest entry for
each number.
The solution look like this:
                 No.          Change           Date          
C              123           bug fixed       2013-01-17
D              111           final                2013-01-12

is there any way to filter my data.frame to the latest data, perhabs max?

Thanks.

Mat



--
View this message in context: 
http://r.789695.n4.nabble.com/Filter-according-to-the-latest-data-tp4657248.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Error: package/namespace load failed for ‘arm’

2013-02-01 Thread Pernille Heelsberg
Hi!
I get the following error messages when trying to run the package 'arm'.

 library(arm)
Error : Functions found when exporting methods from the namespace ‘arm’
which are not S4 generic: ‘fixef’, ‘ranef’
Error: package/namespace load failed for ‘arm’

My R version is:
 version
   _
platform   i386-w64-mingw32
arch   i386
os mingw32
system i386, mingw32
status
major  2
minor  15.2
year   2012
month  10
day26
svn rev61015
language   R
version.string R version 2.15.2 (2012-10-26)
nickname   Trick or Treat

When running the install for 'arm' it looks like the install was successful
and 'arm' shows up in my package list and has created a folder in the
library (C:\Users\Pernille\Documents\R\win-library\2.15\arm)

 install.packages(arm)
Installing package(s) into ‘C:/Users/Pernille/Documents/R/win-library/2.15’
(as ‘lib’ is unspecified)
trying URL '
http://cran.rstudio.com/bin/windows/contrib/2.15/arm_1.6-01.02.zip'
Content type 'application/zip' length 237391 bytes (231 Kb)
opened URL
downloaded 231 Kb

package ‘arm’ successfully unpacked and MD5 sums checked

I have tried uninstalling and reinstalling 'arm'. I recently updated R from
version 2.15.0 to 2.15.2 but 'arm' ran with the same error messages in both
versions. It has never worked on my computer. I have tried in both 64 and
32 bit R.

Hope someone can help me, I much appreciate it!

Pernille Heelsberg

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] x-axis labelling

2013-02-01 Thread Alfredo Tello
Hi Folks!

I have a question regarding an issue on which I´ve read a few threads but
can´t resolve -- Labelling every other tick mark on the x-axis using
factors. Here´s an example:

db-data.frame(date=as.factor(c(1:50)),var=rnorm(50))
barplot(db$var,names.arg=levels(db$date),las=2,cex.axis=0.8,cex=0.8)

This labels all the tickmarks in my barplot, which is the default behavior.
I would like to label every other tick mark in the barplot --
1,3,5,7 and so on.

Thank you very much for your help!


All the best,

A

-- 
Alfredo Tello (http://alfredotello.com)
Sustainable Aquaculture Group
Institute of Aquaculture
University of Stirling
Scotland, UK.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Are there a read and write functions for data in the vowpal_wabbit format?

2013-02-01 Thread Roy Lowrance
Has anyone implemented a reader and writer for data in the vowpal wabbit
format? It's a format for representing sparse data in ASCII files, see

https://github.com/JohnLangford/vowpal_wabbit


- Roy

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Filter according to the latest data

2013-02-01 Thread arun
Hi,
Perhaps, (#Untested)
do.call(rbind,lapply(split(dat1,dat1$No),function(x) tail(x,1)))

#or
library(plyr)
ddply(dat1,.(No), function(x) x[nrow(x),])

A.K.


- Original Message -
From: Mat matthias.we...@fnt.de
To: r-help@r-project.org
Cc: 
Sent: Friday, February 1, 2013 3:04 AM
Subject: [R] Filter according to the latest data

Hello together,

i have a data.frame, like this one:
                 No.          Change           Date          
A              123           final                2013-01-15
B              123           error               2013-01-16
C              123           bug fixed       2013-01-17
D              111           final                2013-01-12

and now a want a new data.frame which includes only the newest entry for
each number.
The solution look like this:
                 No.          Change           Date          
C              123           bug fixed       2013-01-17
D              111           final                2013-01-12

is there any way to filter my data.frame to the latest data, perhabs max?

Thanks.

Mat



--
View this message in context: 
http://r.789695.n4.nabble.com/Filter-according-to-the-latest-data-tp4657248.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Doing mixed stock analysis with predict function using DFA

2013-02-01 Thread Melanie Zoelck

Dear R Help Mailing List Members,

I have some baseline data (attached) on which I plan to run a DFA in R to find 
out the best predictor variables that best allow discrimination between the 
different stocks. I then need to apply this DFA function to the mixed data set 
(attached), which contains all the predictor variables of the baseline data 
set, but does not contain a stock variable, as their origin is unknown. I am 
trying to classify the mixed stock fish back to their origin stock using the 
DFA function generated from the baseline data. My question is whether it is 
possible to use the predict function with the generated DFA function and 
provide the mixed data set as the newdata argument within this to generate the 
classification, even though the mixed data set does not contain the stock 
variable? 

Unfortunately, the mixstock package that has recently become available is not 
capable of dealing with continuous variables at this time, so I am unable to 
use it.

I would be grateful for any guidance you may be able to provide.

Thank you,

Melanie

Melanie Zölck (Zoelck)
PhD Candidate
Galway-Mayo Institute of Technology
Marine and Freshwater Research Centre
Commercial Fisheries Research Group
Department of Life Science
Dublin Road
Galway
Republic of Ireland
E-mail: mzoe...@hotmaill.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] use name (not values!) of a dataframe inside a funktion

2013-02-01 Thread Greg Snow
It is strongly discouraged in R to have functions that change data values
in the global workspace (or any location other than their local
environment).

The usual procedure in R is to have your function return a modified version
of the object and the user then decides what to do with it.  They can
assign it back to the same original object so that there is still only one
copy and it has changed (but the user made that decision, not the
programmer), or they can save it to a different name and not lose the
original.

If you really want to change the original copy (and there are sometimes
when the exception to the rule makes sense) then you can either use
environments (which don't copy on modify) or use macros instead of
functions.  Given your examples I would look at the macro approach first.
 There is a 'defmacro' function in the 'gtools' package and the reference
on the help page for 'defmacro' leads to the original R news (now R
Journal) article describing the use of macros in R (definitely read this if
you are considering this approach).


On Thu, Jan 31, 2013 at 7:34 AM, Winfried Moser winfried.mo...@gmail.comwrote:

 Dear Listers,

 can anyone help me, please.

 Since several days i try to figure out, how to assign values, vectors,
 functions etc to variables with dynamically generated names inside of
 functions.
 Sometimes I succeed, but the success is rather arbitrary, it seems. up to
 now i don't fully understand, why things like get, assign, - etc do
 sometimes work, and sometimes not.

 here's one of my daily examples, i am stuck with: Example 1 does work, but
 example 2 doesn't?
 How kann i tell R, that i want it to expand the string dfb to dfb[,2]
 inside the function.
 In the end i want the function to change the second variable of the
 dataframe dfb permanently to factor (not just inside the function).

 Thanks in advance!

 Winfried


 Example 1:
 dfa - data.frame(a=c(1:4),b=c(1:4))
 dfa[,2] - factor(dfa[,2])
 is.factor(dfa[,2])
 TRUE

 Example 2:
 dfb - data.frame(a=c(1:4),b=c(1:4))
 f.fact - function(x) {x[,2] - factor(x[,2])}
 f.fact(dfb)
 is.factor(dfb[,2])
 FALSE


 PS: I tried a whole lot of other things like, ...
 I really don't know where to keep on searching.


 dfb - data.frame(a=c(1,2,3,4),b=c(1,2,3,4))
 f.fact - function(x) {get(x)[,2] - factor(get(x)[,2])}
 f.fact(dfb)
 is.factor(dfb[,2])
  Object 'x' nicht gefunden

 dfb - data.frame(a=c(1,2,3,4),b=c(1,2,3,4))
 f.fact - function(x) {get(x[,2]) - factor(x[,2])}
 f.fact(dfb)
 is.factor(dfb[,2])
  Object 'x' nicht gefunden

 dfb - data.frame(a=c(1,2,3,4),b=c(1,2,3,4))
 f.fact - function(x) {get(x)[,2] - factor(x[,2])}
 f.fact(dfb)
 is.factor(dfb[,2])
  Object 'x' nicht gefunden

 dfb - data.frame(a=c(1,2,3,4),b=c(1,2,3,4))
 f.fact - function(x) {assign(x[,2], factor(x[,2]))}
 f.fact(dfb)
 is.factor(dfb[,2])
  Ungültiges erstes Argument

 dfb - data.frame(a=c(1,2,3,4),b=c(1,2,3,4))
 f.fact - function(x) {quote(x)[,2], factor(x[,2])}
 f.fact(dfb)
 is.factor(dfb[,2])
  Unerwartetes ','

 dfb - data.frame(a=c(1,2,3,4),b=c(1,2,3,4))
 f.fact - function(x) {
 name - paste0(quote(x),[,2])
  assign(name, factor(x[,2]))}
 f.fact(dfb)
 is.factor(dfb[,2])
  FALSE

 dfb - data.frame(a=c(1,2,3,4),b=c(1,2,3,4))
 f.fact - function(x) {
 name - paste0(get(x),[,2])
 assign(name, factor(x[,2]))}
 f.fact(dfb)
 is.factor(dfb[,2])
  Falsche Anzahl von Dimensionen

 dfb - data.frame(a=c(1,2,3,4),b=c(1,2,3,4))
 f.fact - function(x) {
  name - paste0(x,[,2])
 assign(name, factor(x[,2]))}
 f.fact(dfb)
 is.factor(dfb[,2])
  Falsche Anzahl von Dimensionen

 ächz ...

 [[alternative HTML version deleted]]


 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] glm poisson and quasipoisson

2013-02-01 Thread Ivan-K
I am sure, that this is not a pure Poisson! Huge overdispersion!
You get inflated confidence intervals! 
(although, the point estimates of the regression coefficients stay the same)
Try to look for the causes of overdispersion! It may be geteroscedastisity?
What is the nature of the response, is it the positive integers?
Perhaps in your model still missing something important predictors?
Or just you can try the Gamma or log-normal distr.

Ivan, IPAE UB RAS




--
View this message in context: 
http://r.789695.n4.nabble.com/glm-poisson-and-quasipoisson-tp4657234p4657310.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Relative Risk in logistic regression

2013-02-01 Thread Greg Snow
Ivan,

In reference to your part 2), in 1989 Li and Duan published a paper where
they examined the effect of using the wrong link function (
http://projecteuclid.org/DPubS?service=UIversion=1.0verb=Displayhandle=euclid.aos/1176347254).
 The short version is that they found that in common models (lm and glm and
others) and when the x-variables meet certain conditions, then the
estimates of the slopes will change by a multiplicative constant as will
the variance co-variance matrix. Many of the tests are still well behaved
in this condition.  The link functions you mentioned are all very similar
to each other, so the impact of using a wrong one will be very minor.


On Fri, Feb 1, 2013 at 8:33 AM, Ivan-K k...@ipae.uran.ru wrote:

 Dear colleagues, I have 2 points: One opinion and one question.

 1)
 In one paper in a peer-reviewed journal, I read about the idea of using a
 logit regression as a surrogate for the log-binomial, just adding the
 numerator to the denominator ...
 It’s tempting to immediately get the RR instead of OR ...
 I tried it and I think it's a bad idea, the confidence intervals
 dramatically inflated!
 Any opinions?

 2)
 What would be the criteria for selection of link - functions for binary
 data?
 Usually I use the logit - just for simplest interpretation of parameters.
 Using logit, probit, log-log, and log-log, I get identical values of the
 maximum-likelihood, Pearson statistics, overdispersion parameter, etc.
 However, the regression coefficients and its standard errors are different
 (for logit b is the maximum, for the probit – min., for log-log  C-
 log-log
 are between them). LRs close, but the maximum has the log-log. Wald
 criterion - the maximum for the probit.
 ?What are the interpretations for regression parameters (except logit)???

 Ivan, IPAE RAS




 --
 View this message in context:
 http://r.789695.n4.nabble.com/Relative-Risk-in-logistic-regression-tp4657040p4657297.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] glm poisson and quasipoisson

2013-02-01 Thread ilai
On Thu, Jan 31, 2013 at 2:13 PM, Wim Kreinen wkrei...@gmail.com wrote:

 Hello,

 I have a question about modelling via  glm.


I think you are way off track. Either the data, glm, or both, are not what
you think they are.


 I have a dataset

skn300.tab - structure(list(n = 1:97, freq = c(0L, 0L, 0L, 0L, 1L, 7L, 40L,
100L, 276L, 543L, 952L, 1414L, 1853L, 2199L, 2435L, 2270L, 2042L,
1679L, 1386L, 1108L, 922L, 792L, 642L, 597L, 453L, 424L, 370L,
297L, 278L, 218L, 208L, 172L, 174L, 149L, 124L, 98L, 98L, 67L,
78L, 67L, 46L, 34L, 31L, 42L, 34L, 21L, 28L, 18L, 18L, 18L, 10L,
19L, 6L, 9L, 10L, 6L, 6L, 5L, 3L, 9L, 4L, 3L, 4L, 5L, 2L, 6L,
4L, 2L, 2L, 3L, 3L, 0L, 0L, 0L, 0L, 2L, 1L, 0L, 0L, 0L, 0L, 0L,
1L, 0L, 0L, 1L, 2L, 1L, 0L, 0L, 0L, 0L, 2L, 0L, 0L, 0L, 1L),
kum = c(0L, 0L, 0L, 0L, 1L, 8L, 48L, 148L, 424L, 967L, 1919L,
L, 5186L, 7385L, 9820L, 12090L, 14132L, 15811L, 17197L,
18305L, 19227L, 20019L, 20661L, 21258L, 21711L, 22135L, 22505L,
22802L, 23080L, 23298L, 23506L, 23678L, 23852L, 24001L, 24125L,
24223L, 24321L, 24388L, 24466L, 24533L, 24579L, 24613L, 24644L,
24686L, 24720L, 24741L, 24769L, 24787L, 24805L, 24823L, 24833L,
24852L, 24858L, 24867L, 24877L, 24883L, 24889L, 24894L, 24897L,
24906L, 24910L, 24913L, 24917L, 24922L, 24924L, 24930L, 24934L,
24936L, 24938L, 24941L, 24944L, 24944L, 24944L, 24944L, 24944L,
24946L, 24947L, 24947L, 24947L, 24947L, 24947L, 24947L, 24948L,
24948L, 24948L, 24949L, 24951L, 24952L, 24952L, 24952L, 24952L,
24952L, 24954L, 24954L, 24954L, 24954L, 24955L)), .Names = c(n,
freq, kum), row.names = c(NA, -97L), class = data.frame)


 that looks like as if it where poisson distributed (actually I would
 appreciate that) but it isnt


plot(skn300.tab)

My guess, we are looking at the pdf and cdf (maybe even of a Poisson
process), but not at any data that lends itself to a (generalized) linear
model. Consult a statistician, post on stackexchange, read about
regression, or better define your actual R problem here, demonstrating this
is not homework - see the posting guide.

Cheers




 because  mean unequals var.


  mean (x)
 [1] 901.7827
  var (x)
 [1] 132439.3


 Anyway, I tried to model it via poisson and quasipoisson. Actually, just to
 get an impression how glm works. But I dont know how to interprete the
 data. Of course this is the case because my knowledge concerning logistic
 regressions is rather limited. Hoping there is somebody with mercy I would
 like to understand which parameters are important, e.g. which paramter
 might give me a hint that a poisson model is a bad idea. For hints
 concerning some tutorials  about reading glm-output I would appreciate as
 well.

 Thanks
 Wim


  skn300.glmp - glm (freq~n, data=skn300.tab, family=poisson)
  summary (skn300.glmp)

 Call:
 glm(formula = freq ~ n, family = poisson, data = skn300.tab)

 Deviance Residuals:
 Min   1Q   Median   3Q  Max
 -51.332   -9.383   -6.599   -3.959   55.111

 Coefficients:
   Estimate Std. Error z value Pr(|z|)
 (Intercept)  7.2374375  0.0093285   775.8   2e-16 ***
 n   -0.0539424  0.0003699  -145.8   2e-16 ***
 ---
 Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

 (Dispersion parameter for poisson family taken to be 1)

 Null deviance: 71731  on 96  degrees of freedom
 Residual deviance: 37383  on 95  degrees of freedom
 AIC: 37800

 Number of Fisher Scoring iterations: 6

 
  skn300.glmq - glm (freq~n, data=skn300.tab, family=quasipoisson)
  summary (skn300.glmq)

 Call:
 glm(formula = freq ~ n, family = quasipoisson, data = skn300.tab)

 Deviance Residuals:
 Min   1Q   Median   3Q  Max
 -51.332   -9.383   -6.599   -3.959   55.111

 Coefficients:
  Estimate Std. Error t value Pr(|t|)
 (Intercept)  7.237438   0.186381  38.831   2e-16 ***
 n   -0.053942   0.007391  -7.298  8.8e-11 ***
 ---
 Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

 (Dispersion parameter for quasipoisson family taken to be 399.1874)

 Null deviance: 71731  on 96  degrees of freedom
 Residual deviance: 37383  on 95  degrees of freedom
 AIC: NA

 Number of Fisher Scoring iterations: 6


   dput (skn300.tab)
 structure(list(n = 1:97, freq = c(0L, 0L, 0L, 0L, 1L, 7L, 40L,
 100L, 276L, 543L, 952L, 1414L, 1853L, 2199L, 2435L, 2270L, 2042L,
 1679L, 1386L, 1108L, 922L, 792L, 642L, 597L, 453L, 424L, 370L,
 297L, 278L, 218L, 208L, 172L, 174L, 149L, 124L, 98L, 98L, 67L,
 78L, 67L, 46L, 34L, 31L, 42L, 34L, 21L, 28L, 18L, 18L, 18L, 10L,
 19L, 6L, 9L, 10L, 6L, 6L, 5L, 3L, 9L, 4L, 3L, 4L, 5L, 2L, 6L,
 4L, 2L, 2L, 3L, 3L, 0L, 0L, 0L, 0L, 2L, 1L, 0L, 0L, 0L, 0L, 0L,
 1L, 0L, 0L, 1L, 2L, 1L, 0L, 0L, 0L, 0L, 2L, 0L, 0L, 0L, 1L),
 kum = c(0L, 0L, 0L, 0L, 1L, 8L, 48L, 148L, 424L, 967L, 1919L,
 L, 5186L, 7385L, 9820L, 12090L, 14132L, 15811L, 17197L,
 18305L, 19227L, 20019L, 20661L, 21258L, 21711L, 22135L, 22505L,
 22802L, 23080L, 23298L, 23506L, 23678L, 23852L, 24001L, 

[R] Correcting and Adding to Data Frame

2013-02-01 Thread Rich Shepard

  Running on Slackware here with R-2.15.2. There is a data frame I need to
edit to correct mis-spellings and to add a row. I used the edit() command
with emacs but could not find the mis-spelled strings nor figure out how to
add another row.

  The data frame is stored in
/usr/lib/R/library/bio.infer/data/itis.ttable.rda

and I load it after invoking the bio.infer library.

  Running the bio.infer get.taxonomic() function brings up a dialog box
asking me to correct 5 entries; four are mis-spellings that I correct. When
I close the dialog box the function returns a message that one taxon is not
in the table:

 bcnt - get.taxonomic(emapben)
The following taxa are not in ITIS:
RADOTANYPUS

  Adding this taxon to the table seems to make no difference:

itis.ttable - rbind(itis.ttable, data.frame(PHYLUM = ARTHROPODA,
SUBPHYLUM = NA, SUPERCLASS = NA, CLASS = INSECTA, SUBCLASS = NA,
INFRACLASS = NA, SUPERORDER = NA, ORDER = DIPTERA, SUBORDER =
NEMATOCERA, INFRAORDER = CULICOMORPHA, SUPERFAMILY = CHIRONOMOIDAE,
FAMILY = CHIRONOMIDAE, SUBFAMILY = TANYPODINAE, TRIBE = NA, SUBTRIBE =
NA, GENUS = RADOTANYPUS, TAXON = NA))

bcnt - get.taxonomic(emapben)

The following taxa are not in ITIS:
RADOTANYPUS

  When I edit the names in the get.taxonomic() dialog box they seem to not
be corrected in the data frame as they also are displayed the second time.
And, using rbind() to add the missing row does not seem to take.

  What steps have I missed here, and how should I make these corrections and
the addition?

Rich

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Range difference of plot two arrays in one plot

2013-02-01 Thread Jeff Newmiller
One method is to follow correct usage of base graphics and only use the plot 
function once for each plot. To overlay, use a function like points or lines.

Another approach is to use lattice graphics or ggplot2 and give the data to 
higher-level plot routines that know how to plot multiple groups of data 
together.
---
Jeff NewmillerThe .   .  Go Live...
DCN:jdnew...@dcn.davis.ca.usBasics: ##.#.   ##.#.  Live Go...
  Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
/Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
--- 
Sent from my phone. Please excuse my brevity.

Yuan, Rebecca rebecca.y...@bankofamerica.com wrote:

Hello all,

When I tried to plot the following two arrays in one figure with the
following:

x = c(0,0,0,10,20,30)
y = c(40,50,60,70,80,90)
plot(x, type='o', ylim=c(min(x),max(x)))
par(new=T)
plot(y, type='l', ylim=c(min(y),max(y)))

Found that the first points and last points from those two arrays are
overlapping together, but the value 30 is not equal to 90. How could I
draw those two arrays in one plot with their real values shown in the
plot? i.e., x is in the lower part of the plot and y is in the upper
part of the plot.

Thanks very much!

Cheers,

Rebecca



--
This message, and any attachments, is for the intended
r...{{dropped:5}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Range difference of plot two arrays in one plot

2013-02-01 Thread Yuan, Rebecca
Hello David,

When I used the same ylim for both plots, I got what I need:

x = c(0,0,0,10,20,30)
y = c(40,50,60,70,80,90)
plot(x, type='o', ylim=c(min(x,y),max(x,y)))
par(new=T)
plot(y, type='l', ylim=c(min(x,y),max(x,y)))

Thanks,

Rebecca

-Original Message-
From: David Winsemius [mailto:dwinsem...@comcast.net] 
Sent: Friday, February 01, 2013 11:39 AM
To: Yuan, Rebecca
Subject: Re: [R] Range difference of plot two arrays in one plot


On Feb 1, 2013, at 10:28 AM, Yuan, Rebecca wrote:

 Hello all,

 When I tried to plot the following two arrays in one figure with the
 following:

 x = c(0,0,0,10,20,30)
 y = c(40,50,60,70,80,90)
 plot(x, type='o', ylim=c(min(x),max(x)))
 par(new=T)
 plot(y, type='l', ylim=c(min(y),max(y)))

 Found that the first points and last points from those two arrays are 
 overlapping together, but the value 30 is not equal to 90. How could I 
 draw those two arrays in one plot with their real values shown in the 
 plot? i.e., x is in the lower part of the plot and y is in the upper 
 part of the plot.


Use the same ylim for each plot?

-- 

David Winsemius, MD
Alameda, CA, USA

--
This message, and any attachments, is for the intended r...{{dropped:2}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Transforming 4x3 data frame into 2 column df in R

2013-02-01 Thread arun
Hi,
library(reshape)
foo$id-row.names(foo)
 melt(foo,id.var=id)
   id variable  value
#1   n    w 1.51550092
#2   q    w 0.09977303
#3   r    w 1.17083866
#4   n    x 1.43375720
#5   q    x 0.81737610
#6   r    x 1.24693470
#7   n    y 1.27916240
#8   q    y 1.61234020
#9   r    y 0.87121350
#10  n    z 1.17712300
#11  q    z 0.15107370
#12  r    z 0.84880290
A.K.




- Original Message -
From: Gundala Viswanath gunda...@gmail.com
To: r-h...@stat.math.ethz.ch r-h...@stat.math.ethz.ch
Cc: 
Sent: Friday, February 1, 2013 1:00 AM
Subject: [R] Transforming 4x3 data frame into 2 column df in R

I have the following data frame:

 foo
           w         x         y         z
n 1.51550092 1.4337572 1.2791624 1.1771230
q 0.09977303 0.8173761 1.6123402 0.1510737
r 1.17083866 1.2469347 0.8712135 0.8488029

What I want to do is to change it into :

 newdf
1     n    w 1.51550092
2     q   w 0.09977303
3     r   w 1.17083866
4     n    x 1.43375725
5     q   x 0.81737606
6     r   x 1.24693468
7     n   y 1.27916241
8     q   y 1.61234016
9     r   y 0.87121353
10   n    z 1.17712302
11   q   z 0.15107369
12   r    z 0.84880292

Whtat's the way to do it?

- G.V.

    [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] CFA with lavaan or with SEM

2013-02-01 Thread Rick Bilonick
Not sure if you are aware of the OpenMx SEM package 
(http://openmx.psyc.virginia.edu/). It's a very full-featured structural 
equation modeling package.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Summary of data for each year

2013-02-01 Thread arun


Hi,

You could use:
creek - read.csv(creek.csv,sep=\t)
 colnames(creek) - c(date,flow)
creek$date - as.Date(creek$date, %m/%d/%Y)
creek1 - within(creek, year - format(date, '%Y'))
library(data.table)
 creek2- data.table(creek1)
 
creek2[,list(MEAN=.Internal(mean(flow)),MEDIAN=median(flow),MAX=max(flow),MIN=min(flow)),by=list(year)]
  #  year  MEAN  MEDIAN    MAX    MIN
 #1: 1999 0.6365604 0.47695  7.256 0.3187
 #2: 2000 0.2819057 0.20810  2.380 0.1370
 #3: 2001 0.2950348 0.22260  2.922 0.1769
 #4: 2002 0.5345666 0.21190 14.390 0.1279
 #5: 2003 1.0351742 0.71730 10.150 0.3492
 #6: 2004 0.9691180 0.65240 11.710 0.4178
 #7: 2005 1.2338066 0.72790 17.720 0.4722
 #8: 2006 0.5458652 0.42820  3.351 0.2651
 #9: 2007 0.6331271 0.40410  9.629 0.2784
#10: 2008 0.8792396 0.64770  4.596 0.4131
#11: 2009 0.8465300 0.59450  6.383 0.3877
A.K.

- Original Message -
From: Janesh Devkota janesh.devk...@gmail.com
To: r-help@r-project.org
Cc: 
Sent: Friday, February 1, 2013 2:32 AM
Subject: [R] Summary of data for each year

Hello All,

I have a data with two columns. In one column it is date and in another
column it is flow data.

I was able to read the data as date and flow data. I used the following
code:

creek - read.csv(creek.csv)
library(ggplot2)
creek[1:10,]
colnames(creek) - c(date,flow)
creek$date - as.Date(creek$date, %m/%d/%Y)

The link to my data is https://www.dropbox.com/s/eqpena3nk82x67e/creek.csv

Now, I want to find the summary of each year. I want to especially know
mean, median, maximum etc.

Thanks.

Janesh

    [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] obtainl survival curves for single strata

2013-02-01 Thread Bond, Stephen
Dr. Therneau,

You are correct about the fit:

(cph.approve - 
coxph(Surv(fundterm,resp)~I(CommitAmt/1e5)+res+CommitedRate+dayflag+mth+strata(termfac),
 data=dfa, 
subset=(HedgeDate2012-11-15  p1!=FixedOpen), method=efron))

However the output from survfit has individuals in each column and the strata 
are stacked so, in order to use that I have to subset the exact rows for the 
strata I need even though the strata is provided in the newdata argument.
Based on the help file it seems like the function should be able to use the 
strata info and return a single strata per subject. I am pasting my current 
code, which is not fast due to calls to reshape. If sth similar can be achieved 
by calling the compiled code it should run much faster allowing 100k subjects 
to be done in 2-3 min.

To use survfit as is, I would need to achieve subsetting in a for loop (going 
through columns), which is even slower than reshape. In my old code I processed 
subjects one by one in a for loop and that was much slower than grouping all 
subjects from the same strata together as in the code below.

surv.approve - survfit(cph.approve)
b1 - c(1,cumsum(surv.approve$strata)+1)
b2 - cumsum(surv.approve$strata)
b1 - b1[-length(b1)]
stratbins - 
data.frame(termfac=as.integer(substring(names(b2),9,9)),start=b1,finish=b2)

 stratbins
  termfac start finish
1   1 1 93
2   294187
3   3   188282
4   4   283372
5   5   373462
6   6   463553
7   7   554633
8   8   634695
9   9   696789

strats - sort(unique(newapp$termfac))
for (jj in strats){
  cat('strata ',jj,'\n')
  block - newapp[newapp$termfac==jj,]
  surv - 
surv.approve$surv[stratbins[stratbins$termfac==jj,start]:stratbins[stratbins$termfac==jj,finish]]
   
risk - predict(cph.approve,new=block,type=risk,ref=sample)
  newsurv - outer(surv,risk,^)

  days - 
as.Date(outer(surv.approve$time[stratbins[stratbins$termfac==jj,start]:stratbins[stratbins$termfac==jj,finish]],
block$HedgeDate,+))

  fund - t(t(newsurv)*block$CommitAmt)
  fund - rbind(block$CommitAmt,fund)
  fund - -diff(fund)
  fund - as.data.frame(t(fund))
  fund$acct - as.integer(rownames(fund))
  ncols - ncol(fund)-1
  fundlong - 
reshape(fund,dir=long,varying=list(colnames(fund)[1:ncols]),idvar=acct,timevar=daycnt)
  fundlong - fundlong[order(fundlong$acct,fundlong$daycnt),]

  days - matrix(days,nrow=nrow(days))
  days - t(days)
  days - cbind(fund[,acct],days)
  days - as.data.frame(days)
  colnames(days)[1] - acct
  daylong - 
reshape(days,dir=long,varying=list(colnames(days)[2:(ncols+1)]),idvar=acct,timevar=dt)

  daylong - daylong[order(daylong$acct,daylong$dt),]
  daylong$V2 - as.Date(daylong$V2,origin = 1970-01-01)
  fundlong$dt - as.character(daylong$V2)
  
  sqlSave(ch1,fundlong,tmpfund)
  sqlQuery(ch1, insert into RRfund (acct,dt,fund,daycnt) select 
acct,dt,V1,daycnt from tmpfund)
  sqlQuery(ch1,drop table tmpfund)

}

Output looks like:

acctdt  funddaycnt
3623963 2012-11-16 00:00:00 472.99329489343 1
3623963 2012-11-17 00:00:00 5842.482289437712
3623963 2012-11-18 00:00:00 7807.171026727333
3623963 2012-11-19 00:00:00 7244.017697003454
3623963 2012-11-20 00:00:00 7073.831095927985
3623963 2012-11-21 00:00:00 8745.915155128846
3623963 2012-11-22 00:00:00 9473.871528069497
3623963 2012-11-23 00:00:00 12627.88904229168
3623963 2012-11-24 00:00:00 19598.56847130979
3623963 2012-11-25 00:00:00 56094.581213680210
3623963 2012-11-26 00:00:00 25690.439183149 11
3623963 2012-11-27 00:00:00 25420.991525671412
3623963 2012-11-28 00:00:00 30322.375086543413
3623963 2012-11-29 00:00:00 18651.113684675814
3623963 2012-11-30 00:00:00 20291.452806729215


Stephen 

-Original Message-
From: Terry Therneau [mailto:thern...@mayo.edu] 
Sent: Friday, February 01, 2013 10:11 AM
To: r-help@r-project.org; Bond, Stephen
Subject: Re: obtainl survival curves for single strata

Stephen,
I can almost but not quite get my arms around what you are asking.  A bit 
more detail 
would help.  Like
  cph.approve = what kind of object, generated by function __

But, let me make a guess:
cph is the result of coxph, and the model has both covariates and a strata
You want predictioned survival curves, more than 1, of the type covariates 
= a, b,c, 
strata=1  covariates = d,e, f, strata=2, ... for arbitrary covariates and 
strata.

Now, the predicted survival curves for different strata are different lengths.
The survfit.coxph routine gets around this by being verbose: it expects you to 
give it 
covariate sets, and returns
all of the strata for each covariate. This allows it to give a compact result.  
 You can 
always do:
newpred - survfit(cox-model-fit, newdata=something)
newpred[5,17] #  survival curve for the 5th strata, covariates from 

[R] heatmap tile size question

2013-02-01 Thread Iain Gallagher
Hello List

I was wondering if it is possible to make the individual 'tiles' in a heatmap 
larger. Often when I plot heatmaps and want to label the rows with eg gene 
names I either have to shrink the text or leave it out altogether as it becomes 
so small as to be unreadable. I wondered if there was a way to make the 'tiles' 
deeper and therefore allow more room for the row labels.

I apologise if this is not entirely clear but the toy code below might 
illustrate the problem.

library(hgu133plus2.db) 
# fake up some data
testMat - matrix(rnorm(4500, 4, 2), 150, 30)
testMat[,15:30] - testMat[,15:30]*4 # clear differnce

geneNames - as.list(hgu133plus2SYMBOL) # get some gene names
rownames(testMat) - stack(sample(geneNames, 150))[,1] # relabel the matrix

heatmap(testMat) # row labels overlap

Best

Iain

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Fastest way to compare a single value with all values in one column of a data frame

2013-02-01 Thread Dimitri Liakhovitski
I've compared the solutions.

*Solution 1:*
myf - function( df1, df2 ){
  cond - df2$a  min(df1$a)
  if( cond )
  {
idx - which( df1$a == min(df1$a) )
df1[idx, ] - df2[1, ]
  }
  df1
}

# On a larger example,
set.seed(4530)
tst - data.frame(item = 1:1000,a = rnorm(1000),b = rnorm(1000)) # large
data frame
u-tst
system.time(
for(i in 1:10){
  y-data.frame(item=(1000+i),a=rnorm(1),b=rnorm(1)) # small data frame,
every time new
  u - myf(u, y)
})

Took me about 31.90 sec
 *Solution 2:*
set.seed(4530)
x - data.frame(item = 1:1000,a = rnorm(1000),b = rnorm(1000)) # large data
frame
system.time(
for(i in 1:10){
  y-data.frame(item=(1000+i),a=rnorm(1),b=rnorm(1)) # small data frame,
every time new
  u[intersect(which(u$a  y$a),which.min(u$a)),] - y
})
The solution is correct (despite warnings) but took longer - about 48.84
sec.

Dimitri


On Wed, Jan 30, 2013 at 3:27 PM, Dimitri Liakhovitski 
dimitri.liakhovit...@gmail.com wrote:

 In realy, values in a will be not integers, but numeric. They will never
 be identical, but it could be that they are pretty close - I don't know
 after how many points after the comma matter.
 Dimitri

  On Wed, Jan 30, 2013 at 2:06 PM, arun smartpink...@yahoo.com wrote:

 Hi,
 Any chance x$a to have the same number repeated?

 If `Item` and `a` are unique,  I guess both the solutions should work.

 set.seed(1851)
 x-
 data.frame(item=sample(letters[1:20],20,replace=F),a=sample(1:45,20,replace=F),b=sample(20:50,20,replace=F),stringsAsFactors=F)
 y- data.frame(item=z,a=3,b=10,stringsAsFactors=F)

 x[intersect(which(x$a  y$a),which.min(x$a)),]
  #  item a  b
 #17c 1 48
  x[x$a==which.min(x$a[x$ay$a]),]
 #   item a  b
 #17c 1 48
 #or

 x[x$a%in%which.min(x$a[x$ay$a]),]
 #   item a  b
 #17c 1 48

 x[x$a%in%which.min(x$a[x$ay$a]),]-y

 tail(x)
 #   item  a  b
 #15q 45 30
 #16g 10 23
 #17z  3 10
 #18r 15 39
 #19l 18 45
 #20t 35 33

 #However, if `item` column is unique, but `a` is not, then the one I
 mentioned previously arise.
 set.seed(1851)
 x1-
 data.frame(item=sample(letters[1:20],20,replace=F),a=sample(1:10,20,replace=T),b=sample(20:50,20,replace=F),stringsAsFactors=F)
 y1- data.frame(item=z,a=3,b=10,stringsAsFactors=F)


 x1[intersect(which(x1$a  y1$a),which.min(x1$a)),]
  # item a  b
 #3s 1 41
 x1[x1$a==which.min(x1$a[x1$ay1$a]),]
  #  item a  b
 #3 s 1 41
 #11h 1 46
 #17c 1 48
 x1[x1$a==which.min(x1$a[x1$ay1$a]),]- y1
 A.K.


 
 From: Dimitri Liakhovitski dimitri.liakhovit...@gmail.com
 To: arun smartpink...@yahoo.com
 Cc: R help r-help@r-project.org; Jessica Streicher 
 j.streic...@micromata.de
 Sent: Wednesday, January 30, 2013 1:49 PM
 Subject: Re: [R] Fastest way to compare a single value with all values in
 one column of a data frame


 Sorry - I should have clarified:
 My identifiers (in column item) will always be unique. In other words,
 one entry in column item will never be repeated - neither in x nor in y.
 Dimitri


 On Wed, Jan 30, 2013 at 1:27 PM, Dimitri Liakhovitski 
 dimitri.liakhovit...@gmail.com wrote:

 Thank you, everyone! I'll try to test those different approaches. Really
 appreciate your help!
 Dimitri
 
 
 On Wed, Jan 30, 2013 at 11:03 AM, arun smartpink...@yahoo.com wrote:
 
 HI,
 
 Sorry, my previous solution doesn't work.
 This should work for your dataset:
 set.seed(1851)
 x-
 data.frame(item=sample(letters[1:5],20,replace=TRUE),a=sample(1:15,20,replace=TRUE),b=sample(20:30,20,replace=TRUE),stringsAsFactors=F)
 y- data.frame(item=f,a=3,b=10,stringsAsFactors=F)
  x[x$a%in%which.min(x[x$ay$a,]$a),]- y #if there are multiple minimum
 values
 
 set.seed(1241)
 x1-
 data.frame(item=sample(letters[1:10],1e4,replace=TRUE),a=sample(1:30,1e4,replace=TRUE),b=sample(1:100,1e4,replace=TRUE),stringsAsFactors=F)
 y1- data.frame(item=f,a=3,b=10,stringsAsFactors=F)
 length(x1$a[x1$a==1])
 #[1] 330
  system.time({x1[x1$a%in%which.min(x1[x1$ay1$a,]$a),]- y1})
 #   user  system elapsed
  # 0.000   0.000   0.001
 length(x1$a[x1$a==1])
 #[1] 0
 
 
 #For some reason, it is not working when the multiple number of minimum
 values  some value
 
 set.seed(1241)
 x1-
 data.frame(item=sample(letters[1:10],1e5,replace=TRUE),a=sample(1:30,1e5,replace=TRUE),b=sample(1:100,1e5,replace=TRUE),stringsAsFactors=F)
 y1- data.frame(item=f,a=3,b=10,stringsAsFactors=F)
 length(x1$a[x1$a==1])
 #[1] 3404
 x1[x1$a%in%which.min(x1[x1$ay1$a,]$a),]- y1
  length(x1$a[x1$a==1])
 #[1] 3404 #not getting replaced
 
 #However, if I try:
 set.seed(1241)
  x1-
 data.frame(item=sample(letters[1:10],1e6,replace=TRUE),a=sample(1:5000,1e6,replace=TRUE),b=sample(1:100,1e6,replace=TRUE),stringsAsFactors=F)
  y1- data.frame(item=f,a=3,b=10,stringsAsFactors=F)
  length(x1$a[x1$a==1])
 #[1] 208
  system.time(x1[x1$a%in%which.min(x1[x1$ay1$a,]$a),]- y1)
 #user  system elapsed
  # 0.124   0.016   0.138
   length(x1$a[x1$a==1])
 #[1] 0
 
 
 #Tried Jessica's solution:
 set.seed(1851)
  x-
 

[R] Change default order of colors line types

2013-02-01 Thread Soyeon Kim
Dear R users,

I'd like to change the default order of colors  line types.
Especially I am using ggplot2 and using color Set1.
In Set1, the default color order is red, blue, green, violet,.. ect.
However, I want to put red in fourth (not first).
Likewise, I want to change the order of default linetype. I want to
put solid line in fourth.
How can I do thses?

R code to draw the graph is
qplot(variable, power, data = m.powers, colour = method,
linetype=method,  ylab = Power) + geom_line(aes(group = method),
ylim = c(0,1)) +
  scale_colour_brewer(palette=Set1)

Thank you,

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Loading a list into the environment

2013-02-01 Thread Jonathan Greenberg
R-helpers:

Say I have a list:

myvariables - list(a=1:10,b=20)

Is there a way to load the list components into the environment as
variables based on the component names?  i.e. by applying this theoretical
function to myvariables I would have the variables a and b loaded into the
environment without having to explicitly define them.

--j

-- 
Jonathan A. Greenberg, PhD
Assistant Professor
Global Environmental Analysis and Remote Sensing (GEARS) Laboratory
Department of Geography and Geographic Information Science
University of Illinois at Urbana-Champaign
607 South Mathews Avenue, MC 150
Urbana, IL 61801
Phone: 217-300-1924
http://www.geog.illinois.edu/~jgrn/
AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Armadillo error in R extension

2013-02-01 Thread Simon Zehnder
Is there anyway with some experience in using armadillo in R C++ extensions?

My problem is the following:

I programmed a function in a header looking like

#include armadillo

inline arma::vec foo(input) {

... do something

return an arma::vec object 
}

compiling this via R CMD INSTALL packagename (PKG_CXXFLAGS = 
-I/folder/of/armadillo and armadillo_bits in my package)

I get the following error 

error: expected initializer before 'foo' 

The exact line and word is where inline arma::vec ends.

Does anyone know what this error could be? 


Best

Simon
 
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Change default order of colors line types

2013-02-01 Thread Soyeon Kim
Dear R users,

I'd like to change the default order of colors  line types.
Especially I am using ggplot2 and using color Set1.
In Set1, the default color order is red, blue, green, violet,.. ect.
However, I want to put red in fourth (not first).
Likewise, I want to change the order of default linetype. I want to
put solid line in fourth.
How can I do thses?

R code to draw the graph is
qplot(variable, power, data = m.powers, colour = method,
linetype=method,  ylab = Power) + geom_line(aes(group = method),
ylim = c(0,1)) +
  scale_colour_brewer(palette=Set1)

Thank you,

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] order function

2013-02-01 Thread Kripa R

Hi I'm having a simple issue with the order function. When I use the following 
code my data is not ordered correctly  output[order(output[,3]),]

Namebeta pval
881  9.09303277751237 0.000100253350483199
74026.40553461638365 0.00010228641631914
4879   -8.88509881106217 0.000103251645995887
 However when I export the data and sort it in excel I see the following: Name  
   beta pval   pval
25037   -5.70737 2.48E-07
34294   -19.6931 1.04E-05
36002   -12.2478 1.63E-05  Any suggestions on how I can get 
this sort to work properly on data in scientifict format?  

 
 
  
  
  
  
 
 
  
  
  
  
 
 
  
  
  
  
 
 
  
  
  
  
 


.kripa
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] order function

2013-02-01 Thread Kripa R

Hi I'm having a simple issue with the order function. When I use the following 
code my data is not ordered correctly
 
 output[order(output[,3]),]

Namebeta pval
881  9.09303277751237 0.000100253350483199
74026.40553461638365 0.00010228641631914
4879   -8.88509881106217 0.000103251645995887

 
However when I export the data and sort it in excel I see the following: 
Name beta pval   pval
25037   -5.70737 2.48E-07
34294   -19.6931 1.04E-05
36002   -12.2478 1.63E-05
 
 
Any suggestions on how I can get this sort to work properly on data in 
scientifict format?

.kripa
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] cumulative sum by group and under some criteria

2013-02-01 Thread Zjoanna
I am new to this mailing list. Is everything I posted seen for everyone in
the mailing list? or just you can see the post?

On Fri, Feb 1, 2013 at 11:33 AM, arun kirshna [via R] 
ml-node+s789695n465731...@n4.nabble.com wrote:

 HI,
 Could you show the modified code and also str(dataset)?
 A.K.

 --
  If you reply to this email, your message will be added to the discussion
 below:

 http://r.789695.n4.nabble.com/cumulative-sum-by-group-and-under-some-criteria-tp4657074p4657316.html
  To unsubscribe from cumulative sum by group and under some criteria, click
 herehttp://r.789695.n4.nabble.com/template/NamlServlet.jtp?macro=unsubscribe_by_codenode=4657074code=WmpvYW5uYTIwMTNAZ21haWwuY29tfDQ2NTcwNzR8LTE3NTE1MDA0MzY=
 .
 NAMLhttp://r.789695.n4.nabble.com/template/NamlServlet.jtp?macro=macro_viewerid=instant_html%21nabble%3Aemail.namlbase=nabble.naml.namespaces.BasicNamespace-nabble.view.web.template.NabbleNamespace-nabble.view.web.template.NodeNamespacebreadcrumbs=notify_subscribers%21nabble%3Aemail.naml-instant_emails%21nabble%3Aemail.naml-send_instant_email%21nabble%3Aemail.naml





--
View this message in context: 
http://r.789695.n4.nabble.com/cumulative-sum-by-group-and-under-some-criteria-tp4657074p4657319.html
Sent from the R help mailing list archive at Nabble.com.
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Nested loop and output help

2013-02-01 Thread staysafe23
Thank you very much Petr,

I believe I have fixed my inquiry to not use floating points in my cycle as
you pointed out and to use the list structure to keep my results. I am
still at a loss as to how to run the multiple loops. I have tried quite a
few different strategies but my failure seems to illustrate that my
understanding of how the loops will run is nonexistent.

I would like to simultaneously let the following 4 things vary:

z1 - rnorm(ss,mean=400, sd=70) and z2 - rnorm(ss,mean=450, sd=90) by ss
- seq(5,9,by=1) which yields 5 6 7 8 9

r - cc by cc - seq(-0.5,0.5, by=0.25) which yields -0.50 -0.25 0.00 0.25
0.50

dev1 - oo1 by oo1 - seq(-10,10, by=5) which yields -10 -5 0 5 10

dev2 - oo2 by oo2 - seq(0,20, by=5) which yields 0 5 10 15 20

I tried to run the loops that would vary each of these above conditions
with the looped code attached below and failed very badly.

Thank you Petr and all,

Best,

Thomas

###SINGLE RUN
CODE

lll - vector(mode = list, length = 10)

names(lll) - c(mat1, mat2, mat3, mat4, cut1, cut2, out3a,
out3b, out4a, out4b)

z1 - rnorm(20,mean=400, sd=70)

z2 - rnorm(20,mean=450, sd=90)

cor - runif(1,min=0.4, max=0.6)

X - z1

Y = cor*z1+(1-cor)*z2

lll[[mat1]] - cbind(X,Y)

dev1 - sample(-40:40, 1, replace=T)

lll[[cut1]] - mean(X) + dev1

dev2 - sample(12:54, 1, replace=T)

lll[[cut2]] - mean(X) + dev1 + dev2

X2 - X-lll[[cut1]]

Y2 - Y-lll[[cut2]]

c3 - cor(X2,Y2)

lll[[mat2]] -cbind(X2,Y2)

a11 - ifelse( X  lll[[cut1]]  Y  lll[[cut2]], 1, 0)

a12 - ifelse( X  lll[[cut1]]  Y = lll[[cut2]], 1, 0)

a21 - ifelse( X = lll[[cut1]]  Y  lll[[cut2]], 1, 0)

a22 - ifelse( X = lll[[cut1]]  Y = lll[[cut2]], 1, 0)

lll[[mat3]] -matrix(c(sum(a11),sum(a21), sum(a12),sum(a22)), nrow = 2)

lll[[mat4]] -matrix(c(sum(a11),sum(a22), sum(a12),sum(a21)), nrow = 2)

lll[[out3a]] - mcnemar.test(lll[[mat3]], correct=FALSE)

lll[[out3b]] - mcnemar.test(lll[[mat3]], correct=TRUE)

lll[[out4a]] - chisq.test(lll[[mat4]], correct = FALSE)

lll[[out4b]] - chisq.test(lll[[mat4]], correct = TRUE)

print(lll)

capture.output(print(lll), file = C:/Chi_Square_fix/temp.txt, append =
TRUE)

##LOOPED
CODE#

lll - vector(mode = list, length = 10)

names(lll) - c(mat1, mat2, mat3, mat4, cut1, cut2, out3a,
out3b, out4a, out4b)

ss - seq(5,9,by=1)

cc - seq(-0.5,0.5, by=0.25)

oo1 - seq(-10,10, by=5)

oo2 - seq(0,20, by=5)

for(i in ss) {

for (j in cc) {

for (k in oo1) {

for (l in oo2) {

z1 - rnorm(ss,mean=400, sd=70)

z2 - rnorm(ss,mean=450, sd=90)

r - cc

X - z1

Y = r*z1+(1-r)*z2

lll[[mat1]] - cbind(X,Y)

dev1 - oo1

lll[[cut1]] - mean(X) + dev1

dev2 - oo2

lll[[cut2]] - mean(X) + dev1 + dev2

X2 - X-lll[[cut1]]

Y2 - Y-lll[[cut2]]

c3 - cor(X2,Y2)

lll[[mat2]] -cbind(X2,Y2)

a11 - ifelse( X  lll[[cut1]]  Y  lll[[cut2]], 1, 0)

a12 - ifelse( X  lll[[cut1]]  Y = lll[[cut2]], 1, 0)

a21 - ifelse( X = lll[[cut1]]  Y  lll[[cut2]], 1, 0)

a22 - ifelse( X = lll[[cut1]]  Y = lll[[cut2]], 1, 0)

lll[[mat3]] -matrix(c(sum(a11),sum(a21), sum(a12),sum(a22)), nrow = 2)

lll[[mat4]] -matrix(c(sum(a11),sum(a22), sum(a12),sum(a21)), nrow = 2)

lll[[out3a]] - mcnemar.test(lll[[mat3]], correct=FALSE)

lll[[out3b]] - mcnemar.test(lll[[mat3]], correct=TRUE)

lll[[out4a]] - chisq.test(lll[[mat4]], correct = FALSE)

lll[[out4b]] - chisq.test(lll[[mat4]], correct = TRUE)

print(lll)

capture.output(print(lll), file = C:/Chi_Square_fix/temp.txt, append =
TRUE)

}

}

}

}
On Feb 1, 2013 2:01 AM, PIKAL Petr petr.pi...@precheza.cz wrote:

 Hi

 see inline

  -Original Message-
  From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
  project.org] On Behalf Of staysafe23
  Sent: Friday, February 01, 2013 1:01 AM
  To: r-help@r-project.org
  Subject: [R] Nested loop and output help
 
  Hello Everyone,
 
  My name is Thomas and I have been using R for one week. I recently
  found your site and have been able to search the archives of posts.
  This has given me some great information that has allowed me to craft
  an initial design to an inquiry I would like to make into the breakdown
  of McNemar's test. I have read an intro to R manual and the posting
  guides and hope I am not violating them with this post. If so I will
  re-ask my question in the proper format.
 
  I have succeeded in writing a loop to vary one condition of my inquiry
  but I am unable to understand how to vary the remaining three
  conditions, each with its own for loop. Each time I try to do so I fail
  miserably. Here is my current code :
 
  n - seq(5,10,by=1)
 
  for(i in n) {
 
  z1 - rnorm(n,mean=400, sd=70)
 
  z2 - rnorm(n,mean=450, sd=90)
 
  cor - runif(1,min=0.4, max=0.6)
 
  X - z1
 
  Y = cor*z1+(1-cor)*z2
 
  mat1 - cbind(X,Y)
 
  dev1 - sample(-40:40, 1, replace=T)
 
  cut1 - mean(X) + dev1
 
  dev2 - sample(12:54, 1, replace=T)
 
  cut2 - mean(X) + dev1 + dev2
 
  X2 - X-cut1
 
  Y2 - Y-cut2
 

Re: [R] order function

2013-02-01 Thread Duncan Murdoch

On 13-02-01 12:15 PM, Kripa R wrote:


Hi I'm having a simple issue with the order function. When I use the following 
code my data is not ordered correctly


output[order(output[,3]),]


Namebeta pval
881  9.09303277751237 0.000100253350483199
74026.40553461638365 0.00010228641631914
4879   -8.88509881106217 0.000103251645995887


That's correctly ordered in the pval column.


However when I export the data and sort it in excel I see the following:
Name beta pval   pval
25037   -5.70737 2.48E-07
34294   -19.6931 1.04E-05
36002   -12.2478 1.63E-05


That is a different dataset, also correctly ordered in the third column. 
Excel seems to have added a 4th heading for some reason.



Any suggestions on how I can get this sort to work properly on data in 
scientifict format?


You need to explain what is wrong before we can fix it.

Duncan Murdoch



.kripa  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Loading a list into the environment

2013-02-01 Thread Gabor Grothendieck
On Fri, Feb 1, 2013 at 5:24 PM, Jonathan Greenberg j...@illinois.edu wrote:
 R-helpers:

 Say I have a list:

 myvariables - list(a=1:10,b=20)

 Is there a way to load the list components into the environment as
 variables based on the component names?  i.e. by applying this theoretical
 function to myvariables I would have the variables a and b loaded into the
 environment without having to explicitly define them.

?list2env

--
Statistics  Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Armadillo error in R extension

2013-02-01 Thread Michael Weylandt
Look up the terribly wonderful RcppArmadillo package. 

MW

On Feb 1, 2013, at 8:38 PM, Simon Zehnder szehn...@uni-bonn.de wrote:

 Is there anyway with some experience in using armadillo in R C++ extensions?
 
 My problem is the following:
 
 I programmed a function in a header looking like
 
 #include armadillo
 
 inline arma::vec foo(input) {
 
... do something

return an arma::vec object 
 }
 
 compiling this via R CMD INSTALL packagename (PKG_CXXFLAGS = 
 -I/folder/of/armadillo and armadillo_bits in my package)
 
 I get the following error 
 
 error: expected initializer before 'foo' 
 
 The exact line and word is where inline arma::vec ends.
 
 Does anyone know what this error could be? 
 
 
 Best
 
 Simon
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Armadillo error in R extension

2013-02-01 Thread Dirk Eddelbuettel

On 1 February 2013 at 21:38, Simon Zehnder wrote:
| Is there anyway with some experience in using armadillo in R C++ extensions?

Yes, sure -- we wrote (and use) a package RcppArmadillo that provides very
easy access to Armadillo via Rcpp.
 
| My problem is the following:
| 
| I programmed a function in a header looking like
| 
| #include armadillo
| 
| inline arma::vec foo(input) {
| 
|   ... do something
|   
|   return an arma::vec object 
| }
| 
| compiling this via R CMD INSTALL packagename (PKG_CXXFLAGS = 
-I/folder/of/armadillo and armadillo_bits in my package)
| 
| I get the following error 
| 
| error: expected initializer before 'foo' 
| 
| The exact line and word is where inline arma::vec ends.
| 
| Does anyone know what this error could be? 

We do. I would like to invite to subscribe to the rcpp-devel list, to peruse
its list archives, to study the RcppArmadillo examples (which cover all this)
and, if you still have questions, pose a small reproducible example on the
rcpp-devel list.

Also, we recently opened the Rcpp Gallery (http://gallery.rcpp.org) which
has a set of RcppArmadillo related posts you can read via the tags

http://gallery.rcpp.org/tags/armadillo/

Hope this helps,  Dirk


-- 
Dirk Eddelbuettel | e...@debian.org | http://dirk.eddelbuettel.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Loading a list into the environment

2013-02-01 Thread Rui Barradas

Hello,

Something like this?

myfun - function(x, envir = .GlobalEnv){
nm - names(x)
for(i in seq_along(nm))
assign(nm[i], x[[i]], envir)
}

myvariables - list(a=1:10,b=20)

myfun(myvariables)
a
b


Hope this helps,

Rui Barradas

Em 01-02-2013 22:24, Jonathan Greenberg escreveu:

R-helpers:

Say I have a list:

myvariables - list(a=1:10,b=20)

Is there a way to load the list components into the environment as
variables based on the component names?  i.e. by applying this theoretical
function to myvariables I would have the variables a and b loaded into the
environment without having to explicitly define them.

--j



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Change default order of colors line types

2013-02-01 Thread Jeff Newmiller
Sure. Start by reading [1] and take its advice to heart.
Then study the many ways to create factors with values ordered in sync with the 
color ordering. The color ordering can be set using the default, using any of 
the various color picker functions, or manually.

[1] 
http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example
---
Jeff NewmillerThe .   .  Go Live...
DCN:jdnew...@dcn.davis.ca.usBasics: ##.#.   ##.#.  Live Go...
  Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
/Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
--- 
Sent from my phone. Please excuse my brevity.

Soyeon Kim yunni0...@gmail.com wrote:

Dear R users,

I'd like to change the default order of colors  line types.
Especially I am using ggplot2 and using color Set1.
In Set1, the default color order is red, blue, green, violet,.. ect.
However, I want to put red in fourth (not first).
Likewise, I want to change the order of default linetype. I want to
put solid line in fourth.
How can I do thses?

R code to draw the graph is
qplot(variable, power, data = m.powers, colour = method,
linetype=method,  ylab = Power) + geom_line(aes(group = method),
ylim = c(0,1)) +
  scale_colour_brewer(palette=Set1)

Thank you,

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] mountain lion install, error message

2013-02-01 Thread clairehewson@zen
Hi, I have been trying to install R on my mac which is running mountain lion. 
It is partially working, but one error message I am getting is: 

Error in function ()  : object '.activeModel' not found. 

I cannot find anything on this by googling. It appears, for example, when I try 
to 'add observation statistics to data' under the 'models' menu in R commander.

I wondered if anyone might have any idea what this could mean, and how I might 
try and fix my installation. I have installed the various recommended 
additional packages, and uninstalled and re-installed R around 5 times now. I 
have had exactly the same performance after each installation. I'm rather at a 
loss at where to turn next to get things working properly. 

Thanks. 

Claire.

 
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Help calculating p-values

2013-02-01 Thread Jin Choi
I am trying to figure out how to calculate p-values for the difference in
prevalence of a risk factor between men and women. For example, I find that
277 out of 710 male patients and 125 out of 305 female patients have
obesity, what is the p-value for their difference?

If there is a package that can calculate this in bulk, I would appreciate
to learn about it!

Thank you

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help calculating p-values

2013-02-01 Thread Bert Gunter
This is very basic stuff, so I think your main problem is statistical,
not R related. Try posting on stats.stackexchange.com to get
statistical advice or do some reading about differences in
proportions, contingency tables, and the like.

Incidentally, off the top of my head, I'd say there's no evidence that
the proportions are different. See if I'm right when you do the
correct calculation.

-- Bert

On Fri, Feb 1, 2013 at 3:45 PM, Jin Choi jin.hj.c...@gmail.com wrote:
 I am trying to figure out how to calculate p-values for the difference in
 prevalence of a risk factor between men and women. For example, I find that
 277 out of 710 male patients and 125 out of 305 female patients have
 obesity, what is the p-value for their difference?

 If there is a package that can calculate this in bulk, I would appreciate
 to learn about it!

 Thank you

 [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] mountain lion install, error message

2013-02-01 Thread Jeff Newmiller
Perhaps you would benefit from joining R-sig-mac, and/or perusing their 
archives.
---
Jeff NewmillerThe .   .  Go Live...
DCN:jdnew...@dcn.davis.ca.usBasics: ##.#.   ##.#.  Live Go...
  Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
/Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
--- 
Sent from my phone. Please excuse my brevity.

clairehewson@zen zen11...@zen.co.uk wrote:

Hi, I have been trying to install R on my mac which is running mountain
lion. It is partially working, but one error message I am getting is: 

Error in function ()  : object '.activeModel' not found. 

I cannot find anything on this by googling. It appears, for example,
when I try to 'add observation statistics to data' under the 'models'
menu in R commander.

I wondered if anyone might have any idea what this could mean, and how
I might try and fix my installation. I have installed the various
recommended additional packages, and uninstalled and re-installed R
around 5 times now. I have had exactly the same performance after each
installation. I'm rather at a loss at where to turn next to get things
working properly. 

Thanks. 

Claire.

 
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] mountain lion install, error message

2013-02-01 Thread John Fox
Dear Claire,

This error is due to a bug in the previous version of the Rcmdr package; it
is fixed in the current version on CRAN, version 1.9-4. Simply reinstall the
Rcmdr package from CRAN and verify that you have the correct version.

I hope this helps,
 John

 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
 On Behalf Of clairehewson@zen
 Sent: Friday, February 01, 2013 6:29 PM
 To: r-help@r-project.org
 Subject: [R] mountain lion install, error message
 
 Hi, I have been trying to install R on my mac which is running mountain
 lion. It is partially working, but one error message I am getting is:
 
 Error in function ()  : object '.activeModel' not found.
 
 I cannot find anything on this by googling. It appears, for example,
 when I try to 'add observation statistics to data' under the 'models'
 menu in R commander.
 
 I wondered if anyone might have any idea what this could mean, and how I
 might try and fix my installation. I have installed the various
 recommended additional packages, and uninstalled and re-installed R
 around 5 times now. I have had exactly the same performance after each
 installation. I'm rather at a loss at where to turn next to get things
 working properly.
 
 Thanks.
 
 Claire.
 
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] mountain lion install, error message

2013-02-01 Thread John Fox
Dear Jeff,

As I just explained to Claire, this error is caused by a bug in an earlier
version of the Rcmdr package; the bug affects the Rcmdr on all platforms,
not just Mac OS X, and is fixed in the current version 1.9-4 of the Rcmdr on
CRAN.

Best,
 John

 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
 On Behalf Of Jeff Newmiller
 Sent: Friday, February 01, 2013 8:14 PM
 To: clairehewson@zen; r-help@r-project.org
 Subject: Re: [R] mountain lion install, error message
 
 Perhaps you would benefit from joining R-sig-mac, and/or perusing their
 archives.
 
 ---
 Jeff NewmillerThe .   .  Go
 Live...
 DCN:jdnew...@dcn.davis.ca.usBasics: ##.#.   ##.#.  Live
 Go...
   Live:   OO#.. Dead: OO#..  Playing
 Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
 /Software/Embedded Controllers)   .OO#.   .OO#.
 rocks...1k
 
 ---
 Sent from my phone. Please excuse my brevity.
 
 clairehewson@zen zen11...@zen.co.uk wrote:
 
 Hi, I have been trying to install R on my mac which is running mountain
 lion. It is partially working, but one error message I am getting is:
 
 Error in function ()  : object '.activeModel' not found.
 
 I cannot find anything on this by googling. It appears, for example,
 when I try to 'add observation statistics to data' under the 'models'
 menu in R commander.
 
 I wondered if anyone might have any idea what this could mean, and how
 I might try and fix my installation. I have installed the various
 recommended additional packages, and uninstalled and re-installed R
 around 5 times now. I have had exactly the same performance after each
 installation. I'm rather at a loss at where to turn next to get things
 working properly.
 
 Thanks.
 
 Claire.
 
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] x-axis labelling

2013-02-01 Thread David Winsemius


On Feb 1, 2013, at 9:30 AM, Alfredo Tello wrote:


Hi Folks!

I have a question regarding an issue on which I´ve read a few  
threads but

can´t resolve -- Labelling every other tick mark on the x-axis using
factors. Here´s an example:


Why not:

db-data.frame(date=as.factor(c(1:50)),var=rnorm(50))
barplot(db$var,names.arg=ifelse( rep( c(TRUE,FALSE),  
length=length(levels(db$date) ) ),  levels(db$date),  
),las=2,cex.axis=0.8,cex=0.8)


barplot(db$var,names.arg=ifelse( c(TRUE,FALSE),  levels(db$date),  
},las=2,cex.axis=0.8,cex=0.8)



-- David.
This labels all the tickmarks in my barplot, which is the default  
behavior.

I would like to label every other tick mark in the barplot --
1,3,5,7 and so on.

Thank you very much for your help!


All the best,

A

--
Alfredo Tello (http://alfredotello.com)
Sustainable Aquaculture Group
Institute of Aquaculture
University of Stirling
Scotland, UK.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
Alameda, CA, USA

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] repeating autocovariate functions

2013-02-01 Thread rachel buxton


Hi there,
Just wondering why my post was rejected?
cheersRachel

Subject: repeating autocovariate functions
From: r-help-ow...@r-project.org
To: moy...@hotmail.com
Date: Sat, 2 Feb 2013 02:56:27 +0100

Message rejected by filter rule match
 


--Forwarded Message Attachment--
Date: Fri, 1 Feb 2013 17:56:14 -0800
From: moy...@hotmail.com
To: r-help@r-project.org
Subject: repeating autocovariate functions

Hi there,
 
I would like to repeat an autocovariate term calculation using 30 different
neighborhood sizes.  Then I would like to run a nominal logistic regression
on the generated autocovariate values and their respective neighborhood
sizes to see which would be most appropriate to use in the final calculation
of my autocovariate term.
 
I have a matrix of x,y values:
x   y
174.7173-35.967
174.7166-35.9649
174.7174-35.968
174.7418-35.9678
174.741 -35.9672
174.7395-35.9671
(and 150 more)
 
To calculate the autocovariate terms my code is as follows:
library(spdep)
Taranga - read.csv(file.choose()) #contains xy coordinates
xy - cbind(Taranga$x, Taranga$y)
acinvb - autocov_dist(Taranga$burrows.pa, xy, nbs=1,zero.policy = TRUE, 
type=inverse)
datfrm - data.frame(autocov=acinvb, nbs=1) 
write.table(dfrm, file = 'results.csv',sep=,,row.names = FALSE) 
 
acinva - autocov_dist(Taranga$burrows.pa, xy, nbs=100,zero.policy = TRUE, 
type=inverse)
dfrm -data.frame(autocov=acinva, nbs=100)
names(datfrm) - NULL
write.table(datfrm, file = 'results.csv',sep=,,append=TRUE, row.names =
FALSE, col.names=FALSE) 
 
I want to repeat this function, each time adding 100 to nbs
 
Then I will run a model from the resulting results.csv
 
thanks in advance for any insight!
 
Rachel
 
 
 
--
View this message in context: 
http://r.789695.n4.nabble.com/repeating-autocovariate-functions-tp4657353.html
Sent from the R help mailing list archive at Nabble.com.
 
  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help calculating p-values

2013-02-01 Thread Rolf Turner


This smacks of homework to me.

cheers,

Rolf Turner

On 02/02/2013 12:45 PM, Jin Choi wrote:

I am trying to figure out how to calculate p-values for the difference in
prevalence of a risk factor between men and women. For example, I find that
277 out of 710 male patients and 125 out of 305 female patients have
obesity, what is the p-value for their difference?

If there is a package that can calculate this in bulk, I would appreciate
to learn about it!


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Choice of text for intermediate level R programming course

2013-02-01 Thread William Grove
The Subject line mostly says it.  I'm designing it as a semester-long, 3
hours per week, course
that takes in students who got the basics of R in stats classes, but don't
really know how to
program in it.  Translation:  if their own examples don't look enough like
examples from
previous work, they're stumped.

Does anybody have a text for an intermediate R course (but not too
highbrow, no total guide
to object oriented programming, and class s3 vs. s4 objects), they would
recommend.  I
read 24 books some ielementary some advance not many intermediate, and I
have two I
like but for quite difference reasons.  I won't name them so as not to
prejudice anybody.

It needs to be English language and I greatly prefer a good index like
\LaTeX\ produces
by the makeindex() prograrm.

Regards,
William Grove

-- 
William M. Grove, Ph.D.
Psychology Department
University of Minnesota

My public encryption key.is:

-BEGIN PGP PUBLIC KEY BLOCK-
Version: GnuPG v1.2.2 (MingW32)
mQGiBE+tS9ERBACCTYW1uclqHbUvBvP92rUN8zOlMWe2zr3NfRp6ELNyYa60H5PK
+Xx/vbHGWF0MXhGLLVFMiRNBRomJ4S30Ify4exTg6vWpPrVsmDvyKLmYilWSOuw1
0yoimDdr4DphGQSKqVD/O/tLUhO1wyEKJI89qtJmFG4usnn7f/Bvt4cRZwCgq76W
LRhGATk0E2GYl0n+2UeFU+8D/ipLpFYNJEkz3y492/9orimBsxsAdnXmK5bpOBG5
pQaoFuvvPMtkUYhVNYUqMyjJniM3F6mnv7OX52sHHcYn89ScO9S+WtWAYsekwTTE
xb1wvwB9u7PoGDC46tihXhTl533oUcMFGMwlVKvFZ4zpWYo5W3i0DmOAtjcMPzl2
4wcfA/0fgbUe2JIkPTf6MFMQTFwpNhCGeDvTfbcC5YCCcAXWcnUT/GKFsvRT0Wje
i/A4PMopJkfLRreM1GSZ6JY0cOL3U071M6YhVmldrL4X74OUAjfdo8mYosxe/qTe
Ji0s6jg/oiXgY/hHLZzDE+AtQ6LRi0/1tpms2Qw9/7pXs8HHErQ0V2lsbGlhbSBN
LiBHcm92ZSAoR1BHIFB1YmxpYyBLZXkpIDxncm92ZTAwMUB1bW4uZWR1PohbBBMR
AgAbBQJPrUvRBgsJCAcDAgMVAgMDFgIBAh4BAheAAAoJEG2MPb65SVkFnNcAnRCi
URO4Oy9dk5Kb6j/1yN1iElTJAKCnlwfwOt0J8jf2nJ78sNkiZNjM6bkCDQRPrUvT
EAgAg52fj9/7k6BuJgmY+ynKm4rwWaRW6VjrnS5VI+luTmEkQi/w9Vz6dOmCJWlz
VLsm7mEW0tvB9VwZbkrttoWnJOmqzRWNJxlRlj78jRq6yN16JZNhbPGDSbjL/+JV
LzlhMetUpuTYPziuatSr7C2RxYYRd5pabgd3G/XowqSBtyD3jrViJ8CqcZlMQGfJ
BRoiyBDlVEOO+aJ/KGLinfbhyz06tR50BpXFFlxBN1m+DInRLF6zRkjZUO+OyFO6
Acy0SM4htfetEI0H0+X1w/Cw8e2Ew48Wwsve+USRr+OLXqwkgBP/jTNRko6cUI5m
LiXjkSXYdR0JonTzgUSh66TDBwADBQf/TR/xC5P5Vs6X5OPu3iHsKRp/dGVpL33D
KBH2Ofw3ps48kQ65la/uMW+fZDvwZJukTIJL4dq5OGjxnwznj3hkZJZWLknRFDx+
ByoumskJXPENphqlnpDV+W2stNwoSY3f5gO9xTQSGeoqqLB6+Xwe4CDZo/wOuhMA
nCXQeIJxiH3GsCnxkZY9e3rw9HdcE/6NbvKGZ/OOLMdorASG9eSuyQZDgKN7jXZ3
2GPbppaRjzUWuRGau1RBv9V0QMY5C0Z1c51njCHuR0cgcItad69IcgaujlC2kouu
C/MW/VOA5rfGhvBBI7nr26MjhxwHUYWQw6i/KgzYwJ+X43cEBCoITIhGBBgRAgAG
BQJPrUvTAAoJEG2MPb65SVkFDLoAn36dv6dBZyIrTD+lF3y1lXO/R1OHAJ0Sn0bi
0TdH/wwxUEFeq5eCMfoZTg==
=91Ch
-END PGP PUBLIC KEY BLOCK-

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] cumulative sum by group and under some criteria

2013-02-01 Thread arun
Hi,

Saw your reply on Nabble:
#Your code:

library(zoo)
res1- do.call(rbind,lapply(lapply(split(d,list(d$m1,d$n1)),function(x) 
 {x$cterm1_p0L[x$Qm=c11]- cumsum(x$term1_p0[x$Qm=c11]); 
  x$cterm1_p0H[x$Qn=c12]- cumsum(x$term1_p0[x$Qn=c12]); 
  x$cterm1_p1L[x$Qm=c11]- cumsum(x$term1_p1[x$Qm=c11]); 
  x$cterm1_p1H[x$Qm=c12]- cumsum(x$term1_p1[x$Qn=c12]); #Check this line Qm 
and Qn  
  x}),function(x) {x$cterm1_p0L-na.locf(x$cterm1_p0L,na.rm=F); 
                   x$cterm1_p0H-na.locf(x$cterm1_p0H,na.rm=F); 
                   x$cterm1_p1L-na.locf(x$cterm1_p1L,na.rm=F); 
                   x$cterm1_p1H-na.locf(x$cterm1_p1H,na.rm=F);x})) 



#should be:
 colnames(d)-c(m1,n1,x1,y1,Fmm, Fnn, Qm, Qn, term1_p0, 
term1_p1) 
res1- do.call(rbind,lapply(lapply(split(d,list(d$m1,d$n1)),function(x) 
{x$cterm1_p0L[x$Qm=c11]- cumsum(x$term1_p0[x$Qm=c11]);
  x$cterm1_p0H[x$Qn=c12]- cumsum(x$term1_p0[x$Qn=c12]); 
 x$cterm1_p1L[x$Qm=c11]- cumsum(x$term1_p1[x$Qm= c11]);
 x$cterm1_p1H[x$Qn=c12]- cumsum(x$term1_p1[x$Qn= c12]);
 x}),function(x) {x$cterm1_p0L-na.locf(x$cterm1_p0L,na.rm=F); 
                   x$cterm1_p0H-na.locf(x$cterm1_p0H,na.rm=F); 
                   x$cterm1_p1L-na.locf(x$cterm1_p1L,na.rm=F); 
                   x$cterm1_p1H-na.locf(x$cterm1_p1H,na.rm=F);x})) 
  
row.names(res1) - 1:nrow(res1)
res1[,11:14][is.na(res1[,11:14])]- 0 


 res1[,11:14][is.na(res1[,11:14])]- 0 
 res1
#   m1 n1 x1 y1  Fmm  Fnn    Qm    Qn     term1_p0 term1_p1  cterm1_p0L   
cterm1_p0H cterm1_p1L cterm1_p1H
#1   2  2  0  0 0.00 0.00 1.000 1.000 0.8145062500  0.40960 0.0e+00 
0.00    0.0    0.0
#2   2  2  0  1 0.00 0.60 1.000 0.400 0.0857375000  0.20480 0.0e+00 
0.00    0.0    0.0
#3   2  2  0  2 0.00 1.00 1.000 0.000 0.0022562500  0.02560 0.0e+00 
0.0022562500    0.0    0.02560
#4   2  2  1  0 0.61 0.00 0.695 0.695 0.0857375000  0.20480 0.0e+00 
0.0022562500    0.0    0.02560
#5   2  2  1  1 0.61 0.62 0.390 0.380 0.009025  0.10240 0.0e+00 
0.0022562500    0.0    0.02560
#6   2  2  1  2 0.63 1.00 0.370 0.000 0.0002375000  0.01280 0.0e+00 
0.0024937500    0.0    0.03840
#7   2  2  2  0 1.00 0.00 0.500 0.500 0.0022562500  0.02560 0.0e+00 
0.0024937500    0.0    0.03840
#8   2  2  2  1 1.00 0.67 0.165 0.165 0.0002375000  0.01280 2.37500e-04 
0.0027312500    0.01280    0.05120
#9   2  2  2  2 1.00 1.00 0.000 0.000 0.062500  0.00160 2.43750e-04 
0.0027375000    0.01440    0.05280
#10  3  2  0  0 0.00 0.00 1.000 1.000 0.7737809375  0.32768 0.0e+00 
0.00    0.0    0.0
#11  3  2  0  1 0.00 0.65 1.000 0.350 0.0814506250  0.16384 0.0e+00 
0.00    0.0    0.0
#12  3  2  0  2 0.00 1.00 1.000 0.000 0.0021434375  0.02048 0.0e+00 
0.0021434375    0.0    0.02048
#13  3  2  1  0 0.67 0.00 0.665 0.665 0.1221759375  0.24576 0.0e+00 
0.0021434375    0.0    0.02048
#14  3  2  1  1 0.60 0.64 0.400 0.360 0.0128606250  0.12288 0.0e+00 
0.0021434375    0.0    0.02048
#15  3  2  1  2 0.66 1.00 0.340 0.000 0.0003384375  0.01536 0.0e+00 
0.0024818750    0.0    0.03584
#16  3  2  2  0 0.71 0.00 0.645 0.645 0.0064303125  0.06144 0.0e+00 
0.0024818750    0.0    0.03584
#17  3  2  2  1 0.69 0.66 0.325 0.325 0.0006768750  0.03072 0.0e+00 
0.0024818750    0.0    0.03584
#18  3  2  2  2 0.64 1.00 0.360 0.000 0.178125  0.00384 0.0e+00 
0.0024996875    0.0    0.03968
#19  3  2  3  0 1.00 0.00 0.500 0.500 0.0001128125  0.00512 0.0e+00 
0.0024996875    0.0    0.03968
#20  3  2  3  1 1.00 0.74 0.130 0.130 0.118750  0.00256 1.18750e-05 
0.0025115625    0.00256    0.04224
#21  3  2  3  2 1.00 1.00 0.000 0.000 0.003125  0.00032 1.21875e-05 
0.0025118750    0.00288    0.04256
#22  2  3  0  0 0.00 0.00 1.000 1.000 0.7737809375  0.32768 0.0e+00 
0.00    0.0    0.0
#23  2  3  0  1 0.00 0.60 1.000 0.400 0.1221759375  0.24576 0.0e+00 
0.00    0.0    0.0
#24  2  3  0  2 0.00 0.65 1.000 0.350 0.0064303125  0.06144 0.0e+00 
0.00    0.0    0.0
#25  2  3  0  3 0.00 1.00 1.000 0.000 0.0001128125  0.00512 0.0e+00 
0.0001128125    0.0    0.00512
#26  2  3  1  0 0.77 0.00 0.615 0.615 0.0814506250  0.16384 0.0e+00 
0.0001128125    0.0    0.00512
#27  2  3  1  1 0.60 0.62 0.400 0.380 0.0128606250  0.12288 0.0e+00 
0.0001128125    0.0    0.00512
#28  2  3  1  2 0.61 0.72 0.390 0.280 0.0006768750  0.03072 0.0e+00 
0.0001128125    0.0    0.00512
#29  2  3  1  3 0.65 1.00 0.350 0.000 0.118750  0.00256 0.0e+00 
0.0001246875    0.0    0.00768
#30  2  3  2  0 1.00 0.00 0.500 0.500 0.0021434375  0.02048 0.0e+00 
0.0001246875    0.0    0.00768
#31  2  3  2  1 1.00 0.58 0.210 0.210 0.0003384375  0.01536 0.0e+00 
0.0001246875    0.0    0.00768
#32  2  3  2  2 1.00 0.60 0.200 0.200 0.178125  0.00384 1.78125e-05 
0.0001425000    0.00384    0.01152
#33  2  3  2  3 1.00 1.00 0.000 

Re: [R] cumulative sum by group and under some criteria

2013-02-01 Thread arun
HI,

I think this should be more correct:
maxN-9 
c11-0.2 
c12-0.2 
p0L-0.05 
p0H-0.05 
p1L-0.20 
p1H-0.20 

d - structure(list(m1 = c(2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 
2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3), 
    n1 = c(2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 
    3, 3, 3, 3, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2), x1 = c(0, 
    0, 0, 1, 1, 1, 2, 2, 2, 0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 
    2, 0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3), y1 = c(0, 1, 2, 0, 
    1, 2, 0, 1, 2, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 
    2, 0, 1, 2, 0, 1, 2, 0, 1, 2), Fmm = c(0, 0, 0, 0.7, 0.59, 
    0.64, 1, 1, 1, 0, 0, 0, 0, 0.63, 0.7, 0.74, 0.68, 1, 1, 1, 
    1, 0, 0, 0, 0.62, 0.63, 0.6, 0.63, 0.6, 0.68, 1, 1, 1), Fnn = c(0, 
    0.64, 1, 0, 0.51, 1, 0, 0.67, 1, 0, 0.62, 0.69, 1, 0, 0.54, 
    0.62, 1, 0, 0.63, 0.73, 1, 0, 0.63, 1, 0, 0.7, 1, 0, 0.7, 
    1, 0, 0.58, 1), Qm = c(1, 1, 1, 0.65, 0.45, 0.36, 0.5, 0.165, 
    0, 1, 1, 1, 1, 0.685, 0.38, 0.32, 0.32, 0.5, 0.185, 0.135, 
    0, 1, 1, 1, 0.69, 0.37, 0.4, 0.685, 0.4, 0.32, 0.5, 0.21, 
    0), Qn = c(1, 0.36, 0, 0.65, 0.45, 0, 0.5, 0.165, 0, 1, 0.38, 
    0.31, 0, 0.685, 0.38, 0.32, 0, 0.5, 0.185, 0.135, 0, 1, 0.37, 
    0, 0.69, 0.3, 0, 0.685, 0.3, 0, 0.5, 0.21, 0), term1_p0 = c(0.81450625, 
    0.0857375, 0.00225625, 0.0857375, 0.009025, 0.0002375, 0.00225625, 
    0.0002375, 6.25e-06, 0.7737809375, 0.1221759375, 0.0064303124999, 
    0.0001128125, 0.081450625, 0.012860625, 0.000676875, 1.1875e-05, 
    0.0021434375, 0.0003384375, 1.78125e-05, 3.125e-07, 0.7737809375, 
    0.081450625, 0.0021434375, 0.1221759375, 0.012860625, 0.0003384375, 
    0.0064303124999, 0.000676875, 1.78125e-05, 0.0001128125, 
    1.1875e-05, 3.125e-07), term1_p1 = c(0.4096, 0.2048, 0.0256, 
    0.2048, 0.1024, 0.0128, 0.0256, 0.0128, 0.0016, 0.32768, 
    0.24576, 0.06144, 0.00512, 0.16384, 0.12288, 0.03072, 0.00256, 
    0.02048, 0.01536, 0.00384, 0.00032, 0.32768, 0.16384, 0.02048, 
    0.24576, 0.12288, 0.01536, 0.06144, 0.03072, 0.00384, 0.00512, 
    0.00256, 0.00032)), .Names = c(m1, n1, x1, y1, Fmm, 
Fnn, Qm, Qn, term1_p0, term1_p1), row.names = c(NA, 
33L), class = data.frame)

library(zoo)
lst1- split(d,list(d$m1,d$n1))
res2-do.call(rbind,lapply(lst1[lapply(lst1,nrow)!=0],function(x){
x[,11:14]-NA;
x[,11:12][x$Qm=c11,]-cumsum(x[,9:10][x$Qm=c11,]);
x[,13:14][x$Qn=c12,]-cumsum(x[,9:10][x$Qn=c12,]);
colnames(x)[11:14]- c(cterm1_P0L,cterm1_P1L,cterm1_P0H,cterm1_P1H);
x1-na.locf(x);
x1[,11:14][is.na(x1[,11:14])]-0;
x1}))
row.names(res2)- 1:nrow(res2)

 res2
 #  m1 n1 x1 y1  Fmm  Fnn    Qm    Qn     term1_p0 term1_p1   cterm1_P0L 
cterm1_P1L   cterm1_P0H cterm1_P1H

#1   2  2  0  0 0.00 0.00 1.000 1.000 0.8145062500  0.40960 0.00    
0.0 0.00    0.0
#2   2  2  0  1 0.00 0.64 1.000 0.360 0.0857375000  0.20480 0.00    
0.0 0.00    0.0
#3   2  2  0  2 0.00 1.00 1.000 0.000 0.0022562500  0.02560 0.00    
0.0 0.0022562500    0.02560
#4   2  2  1  0 0.70 0.00 0.650 0.650 0.0857375000  0.20480 0.00    
0.0 0.0022562500    0.02560
#5   2  2  1  1 0.59 0.51 0.450 0.450 0.009025  0.10240 0.00    
0.0 0.0022562500    0.02560
#6   2  2  1  2 0.64 1.00 0.360 0.000 0.0002375000  0.01280 0.00    
0.0 0.0024937500    0.03840
#7   2  2  2  0 1.00 0.00 0.500 0.500 0.0022562500  0.02560 0.00    
0.0 0.0024937500    0.03840
#8   2  2  2  1 1.00 0.67 0.165 0.165 0.0002375000  0.01280 0.0002375000    
0.01280 0.0027312500    0.05120
#9   2  2  2  2 1.00 1.00 0.000 0.000 0.062500  0.00160 0.0002437500    
0.01440 0.0027375000    0.05280
#10  3  2  0  0 0.00 0.00 1.000 1.000 0.7737809375  0.32768 0.00    
0.0 0.00    0.0
#11  3  2  0  1 0.00 0.63 1.000 0.370 0.0814506250  0.16384 0.00    
0.0 0.00    0.0
#12  3  2  0  2 0.00 1.00 1.000 0.000 0.0021434375  0.02048 0.00    
0.0 0.0021434375    0.02048
#13  3  2  1  0 0.62 0.00 0.690 0.690 0.1221759375  0.24576 0.00    
0.0 0.0021434375    0.02048
#14  3  2  1  1 0.63 0.70 0.370 0.300 0.0128606250  0.12288 0.00    
0.0 0.0021434375    0.02048
#15  3  2  1  2 0.60 1.00 0.400 0.000 0.0003384375  0.01536 0.00    
0.0 0.0024818750    0.03584
#16  3  2  2  0 0.63 0.00 0.685 0.685 0.0064303125  0.06144 0.00    
0.0 0.0024818750    0.03584
#17  3  2  2  1 0.60 0.70 0.400 0.300 0.0006768750  0.03072 0.00    
0.0 0.0024818750    0.03584
#18  3  2  2  2 0.68 1.00 0.320 0.000 0.178125  0.00384 0.00    
0.0 0.0024996875    0.03968
#19  3  2  3  0 1.00 0.00 0.500 0.500 0.0001128125  0.00512 0.00    
0.0 0.0024996875    0.03968
#20  3  2  3  1 1.00 0.58 0.210 0.210 0.118750  0.00256 0.00    
0.0 0.0024996875    0.03968
#21  3  2  3  2 1.00 1.00 0.000 0.000 0.003125  0.00032 0.003125    
0.00032 0.002500    0.04000
#22  2  3  0  0 0.00 0.00 1.000 

Re: [R] repeating autocovariate functions

2013-02-01 Thread David Winsemius

The short answer is ... because it was from Nabble.

The longer answer is that the filters needed to be tightened on Nabble  
postings because of a an autobot spammer and the moderation queue was  
being swamped with cases.


If you send your questions as email (from your email client) to r-help@r-project.org 
 you should have fewer problems. (you might want to learn to post in  
plain text, too.)


--
David

On Feb 1, 2013, at 7:58 PM, rachel buxton wrote:



Hi there,

Just wondering why my post was rejected?

cheers
Rachel

Subject: repeating autocovariate functions
From: r-help-ow...@r-project.org
To: moy...@hotmail.com
Date: Sat, 2 Feb 2013 02:56:27 +0100

Message rejected by filter rule match



--Forwarded Message Attachment--
Date: Fri, 1 Feb 2013 17:56:14 -0800
From: moy...@hotmail.com
To: r-help@r-project.org
Subject: repeating autocovariate functions

Hi there,

I would like to repeat an autocovariate term calculation using 30  
different
neighborhood sizes.  Then I would like to run a nominal logistic  
regression
on the generated autocovariate values and their respective  
neighborhood
sizes to see which would be most appropriate to use in the final  
calculation

of my autocovariate term.

I have a matrix of x,y values:
x   y
174.7173-35.967
174.7166-35.9649
174.7174-35.968
174.7418-35.9678
174.741 -35.9672
174.7395-35.9671
(and 150 more)

To calculate the autocovariate terms my code is as follows:
library(spdep)
Taranga - read.csv(file.choose()) #contains xy coordinates
xy - cbind(Taranga$x, Taranga$y)
acinvb - autocov_dist(Taranga$burrows.pa, xy, nbs=1,zero.policy =  
TRUE,

type=inverse)
datfrm - data.frame(autocov=acinvb, nbs=1)
write.table(dfrm, file = 'results.csv',sep=,,row.names = FALSE)

acinva - autocov_dist(Taranga$burrows.pa, xy, nbs=100,zero.policy =  
TRUE,

type=inverse)
dfrm -data.frame(autocov=acinva, nbs=100)
names(datfrm) - NULL
write.table(datfrm, file = 'results.csv',sep=,,append=TRUE,  
row.names =

FALSE, col.names=FALSE)

I want to repeat this function, each time adding 100 to nbs

Then I will run a model from the resulting results.csv

thanks in advance for any insight!

Rachel



--
View this message in context: 
http://r.789695.n4.nabble.com/repeating-autocovariate-functions-tp4657353.html
Sent from the R help mailing list archive at Nabble.com.



David Winsemius, MD
Alameda, CA, USA

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.