Re: [R] List of Variables in Original Order

2012-09-28 Thread arun


Hi,
Maybe this helps you:
set.seed(1)
mat1 <- matrix(rnorm(60,5),nrow=5,ncol=12)
colnames(mat1) <- paste0("Var",1:12)
vec2 <- format(c(1,cor(mat1[,1],mat1[,2:12])),digits=4)
vec3 <- colnames(mat1)
arr2 <- array(rbind(vec3,vec2),dim=c(2,3,4))
res <- data.frame(do.call(rbind,lapply(1:dim(arr2)[3],function(i) arr2[,,i])))
res
#        X1       X2       X3 
#1     Var1     Var2     Var3 
#2  1.0  0.27890 -0.61497 
#3     Var4     Var5     Var6 
#4  0.24916 -0.76155  0.30853 
#5     Var7     Var8     Var9 
#6 -0.46413  0.79287  0.05191 
#7    Var10    Var11    Var12 
#8 -0.06940 -0.53251  0.06766 
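
On the ordering problem itself: names() (or colnames()) returns the column names in the order they appear in the data frame, whereas ls() sorts them alphabetically. A minimal sketch, assuming a small made-up data frame in place of mydata.df (the column names and the helper objects nm, val and tab are invented for illustration):

```r
# Hypothetical stand-in for mydata.df; the first column is the variable of interest.
set.seed(1)
mydata.df <- as.data.frame(matrix(rnorm(70), nrow = 10))
names(mydata.df) <- c("VarA", "zeta", "alpha", "mid", "beta", "omega", "kappa")

others <- names(mydata.df)[-1]             # names() keeps the column order
cors   <- cor(mydata.df[, 1], mydata.df[, -1])

# byrow = TRUE fills each row left to right, so names and values line up
nm  <- matrix(others, ncol = 3, byrow = TRUE)
val <- matrix(format(as.vector(cors), digits = 4), ncol = 3, byrow = TRUE)

# interleave the rows: odd rows hold names, even rows hold correlations
tab <- rbind(nm, val)[order(rep(seq_len(nrow(nm)), 2)), ]
tab
```

This only works as written when the number of variables is a multiple of the number of columns; otherwise the last row would need padding.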

A.K. 


----- Original Message -----
From: rkulp rk...@charter.net
To: r-help@r-project.org
Cc: 
Sent: Thursday, September 27, 2012 6:26 PM
Subject: [R] List of Variables in Original Order

I am trying to Sweave the output of calculating correlations between one
variable and several others. I wanted to print a table where the
odd-numbered rows contain the variable names and the even-numbered rows
contain the correlations. So if VarA is correlated with all the variables in
mydata.df, then it would look like

var1        var2      var3 
corr1      corr2     corr3
var4       var5        var6
corr4     corr5     corr6
.
.
etc.
I tried using a matrix for the correlations and another one for the variable
names. I built the correlation matrix using 
x = matrix(format(cor(mydata.df[,1],mydata.df[,c(2:79)]),digits=4),nc=3) 
and the variable names matrix using 
y = matrix(ls(mydata.df[c(2:79)]),nc=3). 
The problem is that the function ls returns the names in alphabetical order,
not columnar order.
How do I get the names in columnar order? Is there a better way to display
the correlation of a single variable with a large number of other variables?
If there is, how do I do it? I appreciate any help I can get. This is my
first project in R so I don't know much about it yet.



--
View this message in context: 
http://r.789695.n4.nabble.com/List-of-Variables-in-Original-Order-tp4644436.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




Re: [R] How to test if there is a subvector in a longer vector

2012-09-28 Thread K. Elo

Hi!

28.09.2012 08:41, Atte Tenkanen wrote:

Sorry. I should have mentioned that the order of the components is important.

So c(1,4,6) is accepted as a subvector of c(2,1,1,4,6,3), but not of 
c(2,1,1,6,4,3).

How to test this?


How about this:

--- code ---

g1 <- c(2,1,1,4,6,3)
g2 <- c(2,1,1,6,4,3)
t1 <- c(1,4,6)
t2 <- c(9,8)

!is.na(sum(match(t1,g1)))
[1] TRUE
!is.na(sum(match(t1,g2)))
[1] TRUE
!is.na(sum(match(t2,g1)))
[1] FALSE

--- code ---
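
Note that match() only checks membership, which is why the second test above also returns TRUE. If the order matters, one option is to slide a window over the longer vector. The sketch below assumes "subvector" means a contiguous run; is_subvector is a made-up helper name, not a base R function:

```r
# Hypothetical helper: TRUE if `sub` occurs in `vec` as a contiguous run,
# with its elements in the same order.
is_subvector <- function(sub, vec) {
  n <- length(sub)
  if (n == 0L) return(TRUE)
  if (n > length(vec)) return(FALSE)
  starts <- seq_len(length(vec) - n + 1L)   # every possible start position
  any(vapply(starts,
             function(s) all(vec[s:(s + n - 1L)] == sub),
             logical(1)))
}

is_subvector(c(1, 4, 6), c(2, 1, 1, 4, 6, 3))  # TRUE
is_subvector(c(1, 4, 6), c(2, 1, 1, 6, 4, 3))  # FALSE
is_subvector(c(9, 8),    c(2, 1, 1, 4, 6, 3))  # FALSE
```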

Kind regards,
Kimmo



[R] Simple Question

2012-09-28 Thread Bhupendrasinh Thakre

Hi Everyone,

I am trying a very simple task to append the Timestamp with a variable name so 
something like 
a_2012_09_27_00_12_30 <- rnorm(1,2,1).

Tried some commands but it doesn't work out well. Hope someone has some answer 
on it.

Session Info 

R version 2.15.1 (2012-06-22)
Platform: i386-apple-darwin9.8.0/i386 (32-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base 

other attached packages:
[1] chron_2.3-42    twitteR_0.99.19 rjson_0.2.9     RCurl_1.91-1
bitops_1.0-4.1  tm_0.5-7.1      RMySQL_0.9-3    DBI_0.2-5

loaded via a namespace (and not attached):
[1] slam_0.1-24  tools_2.15.1

Statement I tried :

b <- unclass(Sys.time())
b = 1348812597
c_b <- rnorm(1,2,1)

Works perfect but doesn't show me c_1348812597.

Best Regards,


Bhupendrasinh Thakre






[[alternative HTML version deleted]]



Re: [R] Simple Question

2012-09-28 Thread K. Elo

Hi!

28.09.2012 09:13, Bhupendrasinh Thakre wrote:

Statement I tried :

b <- unclass(Sys.time())
b = 1348812597
c_b <- rnorm(1,2,1)


Do you mean this:

--- code ---

> df <- data.frame(x=0,y=0)
> colnames(df)
[1] "x" "y"
> colnames(df)[2] <- paste("b",unclass(Sys.time()),sep="_")
> colnames(df)
[1] "x"                  "b_1348813791.55393"

--- code ---

HTH,
Kimmo



Re: [R] Simple Question

2012-09-28 Thread Pascal Oettli

Hello,

Try the following:


b <- unclass(Sys.time())
eval(parse(text=paste("c_",b," <- rnorm(1,2,1)",sep="")))
ls()

Regards,
Pascal


On 28/09/2012 15:13, Bhupendrasinh Thakre wrote:


Hi Everyone,

I am trying a very simple task to append the Timestamp with a variable name so 
something like
a_2012_09_27_00_12_30 <- rnorm(1,2,1).

Tried some commands but it doesn't work out well. Hope someone has some answer 
on it.

Session Info

R version 2.15.1 (2012-06-22)
Platform: i386-apple-darwin9.8.0/i386 (32-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] chron_2.3-42    twitteR_0.99.19 rjson_0.2.9     RCurl_1.91-1
bitops_1.0-4.1  tm_0.5-7.1      RMySQL_0.9-3    DBI_0.2-5

loaded via a namespace (and not attached):
[1] slam_0.1-24  tools_2.15.1

Statement I tried :

b <- unclass(Sys.time())
b = 1348812597
c_b <- rnorm(1,2,1)

Works perfect but doesn't show me c_1348812597.

Best Regards,


Bhupendrasinh Thakre






[[alternative HTML version deleted]]



Re: [R] Simple Question

2012-09-28 Thread David Winsemius

On Sep 27, 2012, at 11:13 PM, Bhupendrasinh Thakre wrote:

 
 Hi Everyone,
 
 I am trying a very simple task to append the Timestamp with a variable name 
 so something like 
 a_2012_09_27_00_12_30 <- rnorm(1,2,1).

If you want to assign a value to a character-name you need to use ... `assign`.
You cannot just stick a numeric value, which is what you get with Sys.time(), on
the LHS of a <- and expect R to intuit what you intend.

?assign
assign( "a_2012_09_27_00_12_30" ,  rnorm(1,2,1) )
assign( as.character(unclass(Sys.time())) ,  rnorm(1,2,1) )

(I would have thought you wanted to format that Sys.time() result:)

> format(Sys.time(), "%Y_%m_%d_%H_%M_%S")
[1] "2012_09_27_23_32_40"

> assign(format(Sys.time(), "%Y_%m_%d_%H_%M_%S"),  rnorm(1,2,1) )
> grep("^2012", ls(), value=TRUE)
[1] "2012_09_27_23_33_45"
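
An alternative that avoids creating variables with generated names is to keep the values in a named list and use the timestamp as the element name. This is only a sketch; the object names results, stamp and key are made up for illustration:

```r
results <- list()                                   # container for all draws
stamp   <- format(Sys.time(), "%Y_%m_%d_%H_%M_%S")  # e.g. 2012_09_27_23_32_40
key     <- paste0("a_", stamp)                      # element name with the prefix

results[[key]] <- rnorm(1, 2, 1)                    # store the value under that name
names(results)
results[[key]]                                      # retrieve it again by name
```

Elements stored this way can be listed with names() and looped over with lapply(), which is harder to do with objects created through assign().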


 
 Tried some commands but it doesn't work out well. Hope someone has some 
 answer on it.
 
 Session Info 
 
 R version 2.15.1 (2012-06-22)
 Platform: i386-apple-darwin9.8.0/i386 (32-bit)
 
 locale:
 [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
 
 attached base packages:
 [1] stats graphics  grDevices utils datasets  methods   base 
 
 other attached packages:
 [1] chron_2.3-42    twitteR_0.99.19 rjson_0.2.9     RCurl_1.91-1
 bitops_1.0-4.1  tm_0.5-7.1      RMySQL_0.9-3    DBI_0.2-5
 
 loaded via a namespace (and not attached):
 [1] slam_0.1-24  tools_2.15.1
 
 Statement I tried :
 
 b <- unclass(Sys.time())
 b = 1348812597
 c_b <- rnorm(1,2,1)
 
 Works perfect but doesn't show me c_1348812597.
 
 Best Regards,
 
 
 Bhupendrasinh Thakre
   [[alternative HTML version deleted]]

BT; Please learn to post in plain text. It's really very simple with gmail.

-- 

David Winsemius, MD
Alameda, CA, USA



Re: [R] Annotate a segmented linear regression plot

2012-09-28 Thread David Winsemius

On Sep 27, 2012, at 9:07 PM, Ben Harrison wrote:

 Hello,
 
 I have produced some segmented regressions with the segmented package by
 Vito Muggeo. I have some example data and code below. I want to annotate
 the individual segments with the slope parameter (actually it would be
 nicer to annotate with 1000*slope and add some small amount of text as
 well). How can I do it? Reading the docs for segmented I can access all of
 the slope parameters via a named vector of the coefficients. How can I
 access the slope segments or locations? I have never tried to annotate an R
 plot before, so I don't even know how to 'pin' a bit of text to an x,y
 location on a plot.
 

?text  # should be fairly clear.
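
As a rough illustration of text(), here is a sketch with base graphics only. It does not use the segmented package: the data are made up, the breakpoint brk is assumed to be known, and the two slopes come from separate lm() fits, so all object names are hypothetical.

```r
# Made-up data with a change of slope at x = 50 (illustrative only)
set.seed(42)
x <- 1:100
y <- ifelse(x <= 50, 0.1 * x, 5 + 0.5 * (x - 50)) + rnorm(100, sd = 0.5)

brk  <- 50                                   # breakpoint assumed known here
fit1 <- lm(y ~ x, subset = x <= brk)         # left segment
fit2 <- lm(y ~ x, subset = x >  brk)         # right segment
slopes <- c(coef(fit1)[["x"]], coef(fit2)[["x"]])

plot(x, y)
# place each label at the middle of its segment, just above the fitted value
mid  <- c(mean(x[x <= brk]), mean(x[x > brk]))
yhat <- c(predict(fit1, data.frame(x = mid[1])),
          predict(fit2, data.frame(x = mid[2])))
labs <- sprintf("slope x 1000 = %.0f", 1000 * slopes)
text(mid, yhat, labels = labs, pos = 3)      # pos = 3 puts the text above the point
```

The same text(x, y, labels) call applies to a segmented fit once the x positions of the segments and the slope estimates have been extracted from the fitted object.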


 dput(bullard)
 structure(list(Rt = c(14.4477, 23.6752, 26.723, 33.8508, 37.9628,
 47.0804, 49.7232, 54.6395, 59.9251, 64.7518, 81.1629, 85.7209,
 88.0334, 98.366, 102.6563, 105.6953, 134.8691, 137.3795, 155.0056,
 158.6707, 162.0671, 206.7413, 248.701, 255.9407, 265.5201, 283.1462,
 288.8939, 299.8356, 311.0788, 323.2355, 366.9049, 379.3662, 384.3869,
 392.3246, 436.0853, 439.1246, 454.6023, 458.6247, 464.1744, 479.9764,
 486.5171, 489.5564, 507.5925, 524.7894, 544.0806, 558.7642, 562.4293,
 577.9268, 650.8613, 658.6664, 669.6996, 692.7172, 694.6993),
Tem = c(14.6189, 15.2877, 15.3106, 15.3536, 15.3665, 15.3764,
15.3928, 15.4182, 15.4671, 15.528, 15.5921, 15.7066, 15.7806,
15.8747, 16.0244, 16.146, 16.481, 16.6098, 16.8581, 17.0339,
17.2242, 17.8379, 19.2747, 19.7184, 19.9621, 20.0953, 20.4838,
20.578, 20.774, 21.0112, 23.01, 23.3897, 24.1697, 24.4176,
27.0874, 27.3597, 28.0178, 28.4026, 28.909, 29.7406, 30.532,
30.8734, 32.216, 32.8198, 34.0339, 34.7553, 35.2611, 35.8303,
41.1202, 41.5027, 42.0578, 42.6597, 42.656)), .Names = c("Rt",
 "Tem"), class = "data.frame", row.names = c(NA, -53L))
 
 library(segmented)
 
 out.lm <- lm(Tem ~ Rt, data=bullard)
 
 o <- segmented(out.lm, seg.Z=~Rt, psi=NA, control=seg.control(display=FALSE,
 K=2))
 
 plot(o, lwd=1,col=2:6, main='Plot title')
 points(bullard)
 abline(out.lm, col="red", lwd=2)
 
 Ben.
 
   [[alternative HTML version deleted]]

Please learn to post in plain text.

--

David Winsemius, MD
Alameda, CA, USA



Re: [R] How to test if there is a subvector in a longer vector

2012-09-28 Thread Berend Hasselman

On 28-09-2012, at 07:41, Atte Tenkanen atte...@utu.fi wrote:

 Sorry. I should have mentioned that the order of the components is important.
 
 So c(1,4,6) is accepted as a subvector of c(2,1,1,4,6,3), but not of 
 c(2,1,1,6,4,3).
 
 How to test this?

See this discussion for a variety of solutions.

http://r.789695.n4.nabble.com/matching-a-sequence-in-a-vector-td4389523.html#a4393453

Berend



Re: [R] Running different Regressions using for loops

2012-09-28 Thread Krunal Nanavati
Hi Rui,

Excellent!!  This is what I was looking for. Thanks for the help.

So, now I have stored the result of the 10 regressions in
summ.list <- lapply(lm.list2, summary)

And now once I enter summ.list it gives me the output for all
the 10 regressions...

I wanted to access a beta coefficient of one of the regressions, say
Price2+Media1+Trend+Seasonality...the result of which is stored in
summ.list[2]

I entered the below statement for accessing the Beta coefficient for
Price2...

 summ.list[2]$coefficients[2]
NULL

But this is giving me  NULL  as the output...

What I am looking for, is to access a beta value of a particular variable
from a particular regression output and use it for further analysis.

Can you please help me out with this? I greatly appreciate your
efforts.




Thanks & Regards,

Krunal Nanavati
9769-919198

-----Original Message-----
From: Rui Barradas [mailto:ruipbarra...@sapo.pt]
Sent: 27 September 2012 21:55
To: Krunal Nanavati
Cc: David Winsemius; r-help@r-project.org
Subject: Re: [R] Running different Regressions using for loops

Hello,

Inline.
On 27-09-2012 13:52, Krunal Nanavati wrote:
 Hi,

 Thanks for all your help. I am stuck again, but with a new problem, on
 similar lines.

 I have taken the problem to the next step now...I have now added 2 for
 loops... 1 for the Price variable...and another for the Media variable

 I have taken 5 price variables...and 2 media variables with the trend
 and seasonality (appearing in all of them)...so in all there will be
 10 regressions to run now

 Price 1, Media 1

 Price 1, Media 2

 Price 2, Media 1'

 Price 2, Media 2

 ...and so on

 I have built up a code for it...




 tryout <- read.table("C:\\Users\\Krunal\\Desktop\\R tryout.csv", header=T, sep=",")
 cnames <- names(tryout)
 price <- cnames[grep("Price", cnames)]
 media <- cnames[grep("Media", cnames)]
 resp <- cnames[1]
 regr <- cnames[7:8]
 lm.list <- vector("list", 10)
 for(i in 1:5)
 {
   regress <- paste(price[i], paste(regr, collapse = "+"), sep = "+")
   for(j in 1:2) {
     regress1 <- paste(media[j], regress, sep = "+")
     fmla <- paste(resp, regress1, sep = "~")
     lm.list[[i]] <- lm(as.formula(fmla), data = tryout)
   }
 }
 summ.list <- lapply(lm.list, summary)
 summ.list





 But it is only running...5 regressions...only Media 1 along with the 5
 Price variables, Trend & Seasonality is regressed on Volume...giving
 only 5 outputs

 I feel there is something wrong with the
 lm.list[[i]] <- lm(as.formula(fmla), data = tryout) statement.

No, I don't think so. If it's giving you only 5 outputs the error is
probably in the fmla construction. Put print statements to see the results
of those paste() instructions.
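
Independently of how the formula is built, one thing worth checking in a nested loop of this shape is the index used for storage: if results go into slot i, the second pass of the inner loop writes over the first. The sketch below uses plain strings instead of the real data, and the names by_i, by_k and k are made up for illustration:

```r
price <- paste0("Price", 1:5)
media <- paste0("Media", 1:2)

by_i <- vector("list", 10)   # indexed by the outer loop variable only
by_k <- vector("list", 10)   # indexed by a running counter
k <- 0
for (i in 1:5) {
  for (j in 1:2) {
    fmla <- paste("Volume ~", price[i], "+", media[j])
    by_i[[i]] <- fmla        # overwritten when j moves from 1 to 2
    k <- k + 1
    by_k[[k]] <- fmla        # one slot per (i, j) combination
  }
}
n_i <- sum(!vapply(by_i, is.null, logical(1)))   # filled slots: 5
n_k <- sum(!vapply(by_k, is.null, logical(1)))   # filled slots: 10
c(n_i, n_k)
```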

Supposing your data.frame is now called tryout2,


price <- paste("Price", 1:5, sep = "")
media <- paste("Media", 1:2, sep = "")
pricemedia <- apply(expand.grid(price, media, stringsAsFactors = FALSE),
1, paste, collapse="+")

response <- "Volume"
trendseason <- "Trend+Seasonality"  # do this only once

lm.list2 <- list()
for(i in seq_along(pricemedia)){
 regr <- paste(pricemedia[i], trendseason, sep = "+")
 fmla <- paste(response, regr, sep = "~")
 lm.list2[[i]] <- lm(as.formula(fmla), data = tryout2)
}

The trick is to use ?expand.grid
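
To see what expand.grid contributes, it can be run on its own without any data. A small sketch (the intermediate name grid is an addition for illustration):

```r
price <- paste("Price", 1:5, sep = "")
media <- paste("Media", 1:2, sep = "")

# one row per (price, media) combination; the first variable varies fastest
grid <- expand.grid(price, media, stringsAsFactors = FALSE)
nrow(grid)

# paste each row into a "PriceX+MediaY" term for the formula
pricemedia <- apply(grid, 1, paste, collapse = "+")
length(pricemedia)
pricemedia[1:3]
```

Because the first argument varies fastest, the second element is the Price2 and Media1 combination, which matches the position of that model in lm.list2.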

Hope this helps,

Rui Barradas

   I am not sure about its
 placement...whether it should be in loop 2 or in loop 1

 Can you please help me out??










 Thanks & Regards,

 Krunal Nanavati
 9769-919198

 -----Original Message-----
 From: Rui Barradas [mailto:ruipbarra...@sapo.pt]
 Sent: 27 September 2012 16:22
 To: David Winsemius
 Cc: Krunal Nanavati; r-help@r-project.org
 Subject: Re: [R] Running different Regressions using for loops

 Hello,

 Just to add that you can also

 lapply(lm.list, coef)

 with a different output.

 Rui Barradas
 On 27-09-2012 09:24, David Winsemius wrote:
 On Sep 26, 2012, at 10:31 PM, Krunal Nanavati wrote:

 Dear Rui,

 Thanks for your time.

 I have a question though, when I run the 5 regression, whose outputs
 are stored in lm.list[i], I only get the coefficients for the
 Intercept, Price, Trend & Seasonality as below


 lm.list[1]
 [[1]]

 Call:

 lm(formula = as.formula(fmla), data = tryout)

 Coefficients:

 (Intercept)   Price4Trend  Seasonality

  9923123 -260682664616   551392
 summ.list <- lapply(lm.list, summary)
 coef.list <- lapply(summ.list, coef)
 coef.list
 I am also looking out for t stats and p value and R squared.
 For the r.squared

 rsq.vec <- sapply(summ.list, "$", "r.squared")
 adj.rsq <- sapply(summ.list, "$", "adj.r.squared")
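
 For the t statistics and p-values, the coefficient matrix of each summary already holds them as columns. A sketch using the built-in mtcars data rather than the poster's file (fits, summ, tvals, pvals and rsq are made-up names):

```r
fits <- list(lm(mpg ~ wt, data = mtcars),
             lm(mpg ~ wt + hp, data = mtcars))
summ <- lapply(fits, summary)

ctab <- coef(summ[[2]])          # matrix: one row per term, one column per statistic
colnames(ctab)

tvals <- ctab[, "t value"]       # t statistics for the second model
pvals <- ctab[, "Pr(>|t|)"]      # corresponding p-values
rsq   <- sapply(summ, `[[`, "r.squared")   # R squared for every model
```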

 Do you know,
 how can I get all these statistics. Also, why is  as.formula  used
 in the lm function. It should work without that as well, right?
 No.
 Can you please tell me, why the code that I had written, does not
 work with R. I thought it should work perfectly.
 In R there is a difference between expression objects and character
 objects.
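
 A short sketch of that distinction, again with mtcars as stand-in data: paste() produces a character string, and as.formula() turns it into a formula object that lm() can use.

```r
fmla_chr <- paste("mpg", paste(c("wt", "hp"), collapse = "+"), sep = "~")
class(fmla_chr)                      # a character string

fmla <- as.formula(fmla_chr)
class(fmla)                          # a formula object

fit <- lm(fmla, data = mtcars)
names(coef(fit))
```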

 Thanks & Regards,



 Krunal Nanavati

 9769-919198



 *From:* Rui Barradas [mailto:ruipbarra...@sapo.pt]
 

[R] blank plot----how do I make symbols appear

2012-09-28 Thread Jessica da Silva
Hi,

I am trying to create a scatterplot, coding each point to one of 5
populations.  I was successful when I did this for one set of data, yet
when I try plotting other data a blank plot appears (although the axes are
labelled and I can fit the regression lines from each population).  I have
tried a variety of things to fix this but nothing seems to work.

I can plot the points if I do not specify that I want each population to
have a particular symbol. However, once I add the command [grip$Morph] to
my symbol parameter (e.g., pch=c(2,6,5,19,15) [grip$morph] ), I lose all
the points.  As I mentioned above, I was able to create a plot
successfully using other data points from the same table (different
columns), so I know the data are fine.

Has anyone come across this before?


R-script used:

HAND <- AllMal[,c(2,4,5)]
na.omit(HAND) -> HAND

write.csv(HAND, "grip.csv")

read.csv("grip.csv") -> grip
grip
class(grip)
class(HAND)


grip$morph <- as.character(grip$Morph)

morph <- grip$morph
BML <- grip$BML
grip$MCF -> MCF


reg1 <- lm(BML~MCF,data=subset(grip,morph=="mel"));reg1
reg2 <- lm(BML~MCF,data=subset(grip,morph=="tham"));reg2
reg3 <- lm(BML~MCF,data=subset(grip,morph=="A"));reg3
reg4 <- lm(BML~MCF,data=subset(grip,morph=="B"));reg4
reg5 <- lm(BML~MCF,data=subset(grip,morph=="C"));reg5


plot(MCF,BML,pch=c(2,6,5,19,15)[grip$morph],xlab="Residual Metacarpal Length",
ylab="Residual Hand Strength (Broad Dowel)", main="Males")
abline(reg1,lty=1)
abline(reg2,lty=2)
abline(reg3,lty=3)
abline(reg4,lty=4)
abline(reg5,lty=6)

-- 
*Jessica da Silva*
PhD Candidate

Molecular Ecology & Evolution Program
Applied Biodiversity Research
Kirstenbosch Research Centre
South African National Biodiversity Institute

Postal address:
3 Sangster Road
Howick, KZN
3290

Home/Fax: +27 33 330 2230
Cell: +27 79 045 1781

Email: jessica.m.dasi...@gmail.com
  j.dasi...@sanbi.org.za

Website: http://jmdasilva.doodlekit.com/home/home

[[alternative HTML version deleted]]



Re: [R] blank plot----how do I make symbols appear

2012-09-28 Thread Ken Knoblauch
Jessica da Silva jessica.m.dasilva at gmail.com writes:
 I am trying to create a scatterplot, coding each point to one of 5
 populations.  I was successful when I did this for one set of data, yet
 when I try plotting other data a blank plot appears (although the axes are
 labelled and I can fit the regression lines from each population).

 However, once I add the command [grip$Morph] to
 my symbol parameter (e.g., pch=c(2,6,5,19,15) [grip$morph] ), I lose all
 the points.  As I mentioned above, I was able to create a plot
 successfully using other data points from the same table (different
 columns), so I know the data are fine.
 

Try  

grip$morph <- unclass(grip$Morph)

instead.  Look at what 

as.character(factor(letters[1:3]))

gives you.
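
The effect can be reproduced without the data. In the sketch below (made-up values; the name symbols is chosen for illustration), indexing an unnamed vector with character strings returns NA, so no plotting symbol is drawn, while the integer codes of a factor index by position:

```r
morph   <- factor(c("mel", "tham", "A", "B", "C", "A"))
symbols <- c(2, 6, 5, 19, 15)

by_name <- symbols[as.character(morph)]  # all NA: symbols has no names to match
by_code <- symbols[unclass(morph)]       # integer codes pick elements by position
by_int  <- symbols[as.integer(morph)]    # same result as unclass()

levels(morph)                            # the codes follow this order
```

Note that since R 4.0.0 read.csv() returns character columns by default, so a column read that way would first need factor() before its codes can be used as an index.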





Re: [R] Running different Regressions using for loops

2012-09-28 Thread Gerrit Eichner

Hello, Krunal,

try


summ.list[[2]]$coefficients[2]


Note the double square brackets (as summ.list is a list)!
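
A small sketch of the difference, using a made-up nested list (lst) in place of summ.list: single brackets return a list of length one, which has no element called coefficients, while double brackets return the element itself.

```r
lst <- list(a = list(coefficients = c(x = 1, y = 2)),
            b = list(coefficients = c(x = 3, y = 4)))

class(lst[2])                 # still a list, containing the element "b"
lst[2]$coefficients           # NULL: the wrapper list has no such element

lst[[2]]$coefficients         # the coefficients of the second element
lst[[2]]$coefficients[2]      # its second entry, with the name attached
```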

Hth,

Gerrit


On Fri, 28 Sep 2012, Krunal Nanavati wrote:


Hi Rui,

Excellent!!  This is what I was looking for. Thanks for the help.

So, now I have stored the result of the 10 regressions in
summ.list <- lapply(lm.list2, summary)

And now once I enter summ.list it gives me the output for all
the 10 regressions...

I wanted to access a beta coefficient of one of the regressions, say
Price2+Media1+Trend+Seasonality...the result of which is stored in
summ.list[2]

I entered the below statement for accessing the Beta coefficient for
Price2...


summ.list[2]$coefficients[2]

NULL

But this is giving me  NULL  as the output...

What I am looking for, is to access a beta value of a particular variable
from a particular regression output and use it for further analysis.



<snip>



Re: [R] Running different Regressions using for loops

2012-09-28 Thread Rui Barradas

Hello,

To access list elements you need `[[`, like this:

summ.list[[2]]$coefficients

Or Use the extractor function,

coef(summ.list[[2]])

Rui Barradas
On 28-09-2012 07:23, Krunal Nanavati wrote:

Hi Rui,

Excellent!!  This is what I was looking for. Thanks for the help.

So, now I have stored the result of the 10 regressions in
summ.list <- lapply(lm.list2, summary)

And now once I enter summ.list it gives me the output for all
the 10 regressions...

I wanted to access a beta coefficient of one of the regressions, say
Price2+Media1+Trend+Seasonality...the result of which is stored in
summ.list[2]

I entered the below statement for accessing the Beta coefficient for
Price2...


summ.list[2]$coefficients[2]

NULL

But this is giving me  NULL  as the output...

What I am looking for, is to access a beta value of a particular variable
from a particular regression output and use it for further analysis.

Can you please help me out with this? I greatly appreciate your
efforts.




Thanks & Regards,

Krunal Nanavati
9769-919198


Re: [R] Drawing asymmetric error bars

2012-09-28 Thread Jim Lemon

On 09/27/2012 08:59 PM, Alexandra Howe wrote:

Hello,

I have data which I have arcsin transformed to analyse.
I want to plot my data with error bars however as my data is
back-transformed my standard errors are uneven.
Is there a simple way to draw these asymmetric error bars in R?


Hi Alexandra,
Have a look at the dispersion function in the plotrix package.
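
If an extra package is not wanted, base graphics can also draw such bars with arrows(). The sketch below assumes an arcsine square-root transform and uses made-up means and standard errors on the transformed scale; back is a hypothetical helper for the inverse transform.

```r
m_tr  <- c(0.35, 0.75, 1.15)        # means on the transformed scale (made up)
se_tr <- c(0.08, 0.08, 0.08)        # standard errors on the transformed scale

back <- function(z) sin(z)^2        # inverse of asin(sqrt(p)) on [0, pi/2]

centre <- back(m_tr)                # back-transformed means
lower  <- back(m_tr - se_tr)        # back-transformed lower limits
upper  <- back(m_tr + se_tr)        # back-transformed upper limits

x <- seq_along(centre)
plot(x, centre, ylim = c(0, 1), pch = 19, xaxt = "n",
     xlab = "Group", ylab = "Proportion")
axis(1, at = x)
# code = 3 draws a flat cap (angle = 90) at both ends of each bar
arrows(x, lower, x, upper, angle = 90, code = 3, length = 0.05)
```

Because the limits are back-transformed separately, the upper and lower arms of each bar have different lengths.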

Jim



[R] Crosstable-like analysis (ks test) of dataframe

2012-09-28 Thread Johannes Radinger
Hi,

I have a dataframe with multiple (appr. 20) columns containing
vectors of different values (different distributions).
 Now I'd like to create a crosstable
where I compare the distribution of each vector (df-column) with
each other. For the comparison I want to use the ks.test().
The result should contain as row and column names the column names
of the input dataframe and the cells should be populated with
the p-value of the ks.test for each pairwise analysis.

My data.frame looks like:
df <- data.frame(X=rnorm(1000,2),Y=rnorm(1000,1),Z=rnorm(1000,2))

And the test for one single case is:
ks <- ks.test(df$X,df$Z)

where the p value is:
ks[2]

How can I create an automatized way of this pairwise analysis?
Any suggestions? I guess that is a quite common analysis (probably with
other tests).

cheers,
Johannes



Re: [R] Crosstable-like analysis (ks test) of dataframe

2012-09-28 Thread Rui Barradas

Hello,

Try the following.


f <- function(x, y, ...,
        alternative = c("two.sided", "less", "greater"), exact = NULL){
    #w <- getOption("warn")
    #options(warn = -1)  # ignore warnings
    p <- ks.test(x, y, ..., alternative = alternative, exact = exact)$p.value
    #options(warn = w)
    p
}

n <- 1e1
dat <- data.frame(X=rnorm(n), Y=runif(n), Z=rchisq(n, df=3))

apply(dat, 2, function(x) apply(dat, 2, function(y) f(x, y)))
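
A variant of the same idea, as a sketch: looping over the data frame with sapply() treats each column as a list element and keeps the column names on both dimensions of the result. The data and the name pmat are made up for illustration.

```r
set.seed(1)
dat <- data.frame(X = rnorm(50), Y = runif(50), Z = rchisq(50, df = 3))

# outer sapply runs over columns "a", inner sapply over columns "b";
# each cell holds the p-value of the two-sample KS test for that pair
pmat <- sapply(dat, function(a)
          sapply(dat, function(b) ks.test(a, b)$p.value))
round(pmat, 4)
```

The diagonal compares each column with itself, so those cells are not informative.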

Hope this helps,

Rui Barradas
On 28-09-2012 11:10, Johannes Radinger wrote:

Hi,

I have a dataframe with multiple (appr. 20) columns containing
vectors of different values (different distributions).
  Now I'd like to create a crosstable
where I compare the distribution of each vector (df-column) with
each other. For the comparison I want to use the ks.test().
The result should contain as row and column names the column names
of the input dataframe and the cells should be populated with
the p-value of the ks.test for each pairwise analysis.

My data.frame looks like:
df <- data.frame(X=rnorm(1000,2),Y=rnorm(1000,1),Z=rnorm(1000,2))

And the test for one single case is:
ks <- ks.test(df$X,df$Z)

where the p value is:
ks[2]

How can I create an automatized way of this pairwise analysis?
Any suggestions? I guess that is a quite common analysis (probably with
other tests).

cheers,
Johannes



Re: [R] Running different Regressions using for loops

2012-09-28 Thread Rui Barradas

Hello,

Try

names(lm.list2[[2]]$coefficient[2] )
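
The three forms can be compared on any fitted model. A sketch with the built-in mtcars data standing in for the poster's regressions:

```r
fit <- lm(mpg ~ wt + hp, data = mtcars)

coef(fit)[2]           # value with its name attached
coef(fit)[[2]]         # value only
names(coef(fit))[2]    # name only
```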

Rui Barradas
On 28-09-2012 11:29, Krunal Nanavati wrote:

Ok...this solves a part of my problem

When I type lm.list2[2] ...I get the following output

[[1]]

Call:
lm(formula = as.formula(fmla), data = tryout2)

Coefficients:
(Intercept)   Price2   Media1  Distri1Trend
Seasonality
13491232 -5759030-15203437048628
445351




When I enter lm.list2[[2]]$coefficient[2] it gives me the below
output

Price2
-5759030

And when I enter lm.list2[[2]]$coefficient[[2]] ...I get the
number...which is -5759030


I am looking for a way to get just the name "Price2". Is there a
statement for that?



Thanks & Regards,

Krunal Nanavati
9769-919198


-Original Message-
From: Rui Barradas [mailto:ruipbarra...@sapo.pt]
Sent: 28 September 2012 15:18
To: Krunal Nanavati
Cc: David Winsemius; r-help@r-project.org
Subject: Re: [R] Running different Regressions using for loops

Hello,

To access list elements you need `[[`, like this:

summ.list[[2]]$coefficients

Or Use the extractor function,

coef(summ.list[[2]])
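
A short sketch of why single brackets return NULL here, using the built-in cars data as a stand-in for the poster's regressions (an assumption). The names one.bracket, two.bracket and beta are illustrative only.

```r
fit <- lm(dist ~ speed, data = cars)
summ.list <- lapply(list(fit, fit), summary)

one.bracket <- summ.list[2]     # still a list, of length 1
two.bracket <- summ.list[[2]]   # the summary.lm object itself

class(one.bracket)
class(two.bracket)
is.null(one.bracket$coefficients)   # the wrapper list has no such element

# A single estimate from the coefficient table of the summary
beta <- coef(two.bracket)["speed", "Estimate"]
beta
```

Indexing the coefficient table by row and column name avoids depending on the position of a term in the formula.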

Rui Barradas
On 28-09-2012 07:23, Krunal Nanavati wrote:

Hi Rui,

Excellent!!  This is what I was looking for. Thanks for the help.

So, now I have stored the result of the 10 regressions in
summ.list <- lapply(lm.list2, summary)

And now once I enter summ.list it gives me the output for all
the 10 regressions...

I wanted to access a beta coefficient of one of the regressions...say
Price2+Media1+Trend+Seasonality...the result of which is stored in
summ.list[2]

I entered the below statement for accessing the Beta coefficient for
Price2...


summ.list[2]$coefficients[2]

NULL

But this is giving me  NULL  as the output...

What I am looking for is to access a beta value of a particular
variable from a particular regression output and use it for further
analysis.

Can you please help me out with this? I greatly appreciate your
efforts.




Thanks & Regards,

Krunal Nanavati
9769-919198

-Original Message-
From: Rui Barradas [mailto:ruipbarra...@sapo.pt]
Sent: 27 September 2012 21:55
To: Krunal Nanavati
Cc: David Winsemius; r-help@r-project.org
Subject: Re: [R] Running different Regressions using for loops

Hello,

Inline.
On 27-09-2012 13:52, Krunal Nanavati wrote:

Hi,

Thanks for all your help. I am stuck again, but with a new problem,
on similar lines.

I have taken the problem to the next step now...I have now added 2 for
loops... 1 for the Price variable...and another for the Media
variable

I have taken 5 price variables...and 2 media variables with the
trend and seasonality (appearing in all of them)...so in all there
will be 10 regressions to run now

Price 1, Media 1
Price 1, Media 2
Price 2, Media 1
Price 2, Media 2
...and so on

I have built up a code for it...





tryout=read.table("C:\\Users\\Krunal\\Desktop\\R tryout.csv",header=T,sep=",")
cnames <- names(tryout)
price <- cnames[grep("Price", cnames)]
media <- cnames[grep("Media", cnames)]
resp <- cnames[1]
regr <- cnames[7:8]
lm.list <- vector("list", 10)
for(i in 1:5)
{
  regress <- paste(price[i], paste(regr, collapse = "+"), sep = "+")
  for(j in 1:2) {
    regress1 <- paste(media[j],regress,sep="+")
    fmla <- paste(resp, regress1, sep = "~")
    lm.list[[i]] <- lm(as.formula(fmla), data = tryout)
  }
}
summ.list <- lapply(lm.list, summary)
summ.list




But it is only running...5 regressions...only Media 1 along with the
5 Price variables & Trend & Seasonality is regressed on
Volume...giving only 5 outputs

I feel there is something wrong with the lm.list[[i]] <-
lm(as.formula(fmla), data = tryout)   statement.

No, I don't think so. If it's giving you only 5 outputs the error is
probably in the fmla construction. Put print statements to see the
results of those paste() instructions.

Supposing your data.frame is now called tryout2,


price <- paste("Price", 1:5, sep = "")
media <- paste("Media", 1:2, sep = "")
pricemedia <- apply(expand.grid(price, media,
stringsAsFactors = FALSE), 1, paste, collapse="+")

response <- "Volume"
trendseason <- "Trend+Seasonality"  # do this only once

lm.list2 <- list()
for(i in seq_along(pricemedia)){
   regr <- paste(pricemedia[i], trendseason, sep = "+")
   fmla <- paste(response, regr, sep = "~")
   lm.list2[[i]] <- lm(as.formula(fmla), data = tryout2)
}

The trick is to use ?expand.grid
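
To make the expand.grid() idea concrete, here is a minimal sketch on simulated data. The data frame below is made up (only the column names follow the thread), and reformulate() is swapped in for the paste()/as.formula() construction as an alternative way to build each formula; neither is part of the original code.

```r
set.seed(123)
n <- 50
# Simulated stand-in for tryout2; values are random, names are assumed
tryout2 <- data.frame(
  Volume = rnorm(n, 100, 10),
  Price1 = rnorm(n), Price2 = rnorm(n), Price3 = rnorm(n),
  Price4 = rnorm(n), Price5 = rnorm(n),
  Media1 = rnorm(n), Media2 = rnorm(n),
  Trend = seq_len(n), Seasonality = rep(1:5, length.out = n)
)

price <- paste0("Price", 1:5)
media <- paste0("Media", 1:2)
# One row per Price/Media combination; the first column varies fastest
combos <- expand.grid(price = price, media = media, stringsAsFactors = FALSE)

lm.list2 <- lapply(seq_len(nrow(combos)), function(i) {
  rhs <- c(combos$price[i], combos$media[i], "Trend", "Seasonality")
  lm(reformulate(rhs, response = "Volume"), data = tryout2)
})
names(lm.list2) <- paste(combos$price, combos$media, sep = "+")

length(lm.list2)
names(coef(lm.list2[[2]]))
```

Naming the list elements after their Price/Media pair makes it possible to pick a fit by label instead of by position.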

Hope this helps,

Rui Barradas


I am not sure about its
placement...whether it should be in loop 2 or in loop 1

Can you please help me out??










Thanks & Regards,

Krunal Nanavati
9769-919198

-Original Message-
From: Rui Barradas [mailto:ruipbarra...@sapo.pt]
Sent: 27 September 2012 16:22
To: David Winsemius
Cc: Krunal Nanavati; r-help@r-project.org
Subject: Re: [R] Running different Regressions using for loops

Hello,

Just to add that you can also

lapply(lm.list, coef)

with a different output.

Rui Barradas
On 27-09-2012 09:24, David Winsemius wrote:

On Sep 26, 2012, at 

Re: [R] changing outlier shapes of boxplots using lattice

2012-09-28 Thread Sarah Goslee
I would guess that if you find the bit that says pch="|" and change it to
pch=1 it will solve your question, and that reading ?par will tell you why.
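
A hedged sketch for comparison, using the stock lattice panel function rather than panel.bwplot.intermediate.hh. In plain lattice, pch sets the median marker, while the outlier symbol is taken from the plot.symbol settings; whether the HH panel function treats pch the same way is an assumption that has not been checked here.

```r
library(lattice)

set.seed(1)
dataN <- data.frame(GE_distance = rnorm(260),
                    Diet_B = factor(rep(1:13, each = 20)))

p <- bwplot(GE_distance ~ Diet_B, data = dataN,
            pch = "|",                                        # median drawn as a line
            par.settings = list(plot.symbol = list(pch = 1),  # outliers as open circles
                                box.umbrella = list(lty = 1)))
print(p)
```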

Sarah

On Thursday, September 27, 2012, Elaine Kuo wrote:

 Hello

 This is Elaine.

 I am using package lattice to generate boxplots.
 Using Richard's code, the display was almost perfect except the outlier
 shape.
 Based on the following code, the outliers are vertical lines.
 However, I want the outliers to be empty circles.
 Please kindly help how to modify the code to change the outlier shapes.
 Thank you.

 code
 library(lattice)
 library(HH)  # provides panel.bwplot.intermediate.hh

 dataN <- data.frame(GE_distance=rnorm(260),
                     Diet_B=factor(rep(1:13, each=20)))

 Diet.colors <- c("forestgreen", "darkgreen","chocolate1","darkorange2",
                  "sienna2","red2","firebrick3","saddlebrown","coral4",
                  "chocolate4","darkblue","navy","grey38")

 levels(dataN$Diet_B) <- Diet.colors

 bwplot(GE_distance ~ Diet_B, data=dataN,
        xlab=list("Diet of Breeding Ground", cex = 1.4),
        ylab = list(
          "Distance between Centers of B and NB Range (1000 km)",
          cex = 1.4),
        panel=panel.bwplot.intermediate.hh,
        col=Diet.colors,
        pch=rep("|",13),
        scales=list(x=list(rot=90)),
        par.settings=list(box.umbrella=list(lty=1)))




-- 
Sarah Goslee
http://www.stringpage.com
http://www.sarahgoslee.com
http://www.functionaldiversity.org



Re: [R] Running different Regressions using for loops

2012-09-28 Thread Rui Barradas
Ok, if I'm understanding it well, you want the mean value of Price1, ..., 
Price5? I don't know if it makes any sense, the coefficients already are 
mean values, but see if this is it.


price.coef <- sapply(lm.list, function(x) coef(x)[2])
mean(price.coef)
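
A self-contained sketch of the same pattern on the built-in mtcars data (a stand-in; the fits and variable names here are illustrative and not the poster's): sapply() collects the second coefficient of every model, and the result keeps the term labels as names.

```r
fits <- list(lm(mpg ~ wt + hp, data = mtcars),
             lm(mpg ~ disp + hp, data = mtcars))

# Second coefficient of each fit, i.e. the first regressor after the intercept
second <- sapply(fits, function(x) coef(x)[2])
second
mean(second)
```

Averaging coefficients of different regressors only makes sense if they are on comparable scales, which is for the analyst to judge.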

Rui Barradas
On 28-09-2012 12:07, Krunal Nanavati wrote:

Hi,

Yes the thing that you provided...works fine...but probably I should have
asked for some other thing.

Here is what I am trying to do

I am trying to get the mean of the Price variable...so I am entering the
below function:

  mean(names(lm.list2[[2]]$coefficient[2] ))

but this gives me an error

[1] NA
Warning message:
In mean.default(names(lm.list2[[2]]$coefficient[2])) :
argument is not numeric or logical: returning NA

I thought that getting the text from the list variable...would help me
generate the mean for that text...which is a variable in the data...say
Price 1, Media 2...and so on

Is this a proper approach...if it is...then something more needs to be
done with the function that you provided.

If not, is there a better way...to generate the mean of a particular
variable inside the  for loop  used earlier...given below:


lm.list2 <- list()
for(i in seq_along(pricemedia)){
   regr <- paste(pricemedia[i], trendseason, sep = "+")
   fmla <- paste(response, regr, sep = "~")
   lm.list2[[i]] <- lm(as.formula(fmla), data = tryout2)
}




Thanks & Regards,

Krunal Nanavati
9769-919198


-Original Message-
From: Rui Barradas [mailto:ruipbarra...@sapo.pt]
Sent: 28 September 2012 16:02
To: Krunal Nanavati
Cc: David Winsemius; r-help@r-project.org
Subject: Re: [R] Running different Regressions using for loops

Hello,

Try

names(lm.list2[[2]]$coefficient[2] )

Rui Barradas
On 28-09-2012 11:29, Krunal Nanavati wrote:

Ok...this solves a part of my problem

When I type lm.list2[2] ...I get the following output

[[1]]

Call:
lm(formula = as.formula(fmla), data = tryout2)

Coefficients:
(Intercept)   Price2   Media1  Distri1Trend
Seasonality
 13491232 -5759030-15203437048628
445351




When I enter lm.list2[[2]]$coefficient[2] it gives me the below
output

Price2
-5759030

And when I enter lm.list2[[2]]$coefficient[[2]] ...I get the
number...which is -5759030


I am looking for a way to get just the name Price2...is there a
statement for that??



Thanks & Regards,

Krunal Nanavati
9769-919198


-Original Message-
From: Rui Barradas [mailto:ruipbarra...@sapo.pt]
Sent: 28 September 2012 15:18
To: Krunal Nanavati
Cc: David Winsemius; r-help@r-project.org
Subject: Re: [R] Running different Regressions using for loops

Hello,

To access list elements you need `[[`, like this:

summ.list[[2]]$coefficients

Or Use the extractor function,

coef(summ.list[[2]])

Rui Barradas
On 28-09-2012 07:23, Krunal Nanavati wrote:

Hi Rui,

Excellent!!  This is what I was looking for. Thanks for the help.

So, now I have stored the result of the 10 regressions in
summ.list <- lapply(lm.list2, summary)

And now once I enter summ.list it gives me the output for all
the 10 regressions...

I wanted to access a beta coefficient of one of the regressions...say
Price2+Media1+Trend+Seasonality...the result of which is stored in
summ.list[2]

I entered the below statement for accessing the Beta coefficient for
Price2...


summ.list[2]$coefficients[2]

NULL

But this is giving me  NULL  as the output...

What I am looking for, is to access a beta value of a particular
variable from a particular regression output and use it for further

analysis.

Can you please help me out with this. Greatly appreciate, you guys
efforts.




Thanks & Regards,

Krunal Nanavati
9769-919198

-Original Message-
From: Rui Barradas [mailto:ruipbarra...@sapo.pt]
Sent: 27 September 2012 21:55
To: Krunal Nanavati
Cc: David Winsemius; r-help@r-project.org
Subject: Re: [R] Running different Regressions using for loops

Hello,

Inline.
On 27-09-2012 13:52, Krunal Nanavati wrote:

Hi,

Thanks for all your help. I am stuck again, but with a new problem,
on similar lines.

I have taken the problem to the next step now...i have now added 2

for

loops... 1 for the Price variable...and another for the Media
variable

I have taken 5 price variables...and 2 media variables with the
trend and seasonality(appearing in all of them)so in all there
will be
10 regression to run now

Price 1, Media 1

Price 1, Media 2

Price 2, Media 1'

Price 2, Media 2

...and so on

I have built up a code for it...





tryout=read.table("C:\\Users\\Krunal\\Desktop\\R tryout.csv",header=T,sep=",")
cnames <- names(tryout)
price <- cnames[grep("Price", cnames)]
media <- cnames[grep("Media", cnames)]
resp <- cnames[1]
regr <- cnames[7:8]
lm.list <- vector("list", 10)
for(i in 1:5)
{
  regress <- paste(price[i], paste(regr, collapse = "+"), sep = "+")
  for(j in 1:2) {
    regress1 <- paste(media[j],regress,sep="+")
    fmla <- paste(resp,

Re: [R] What to use for ti in back-transforming summary statistics from F-T double square-root transformation in 'metafor'

2012-09-28 Thread Viechtbauer Wolfgang (STAT)
Dear Chunyan,

One possibility would be to use the harmonic mean of the person-time at risk 
values. You will have to do this manually though at the moment. Here is an 
example:

### let's just use the treatment group data from dat.warfarin
data(dat.warfarin)
dat <- escalc(xi=x1i, ti=t1i, measure="IRFT", data=dat.warfarin, append=TRUE)
dat

### check if back-transformation of individual IRFT values works
transf.iirft(dat$yi, ti=dat$t1i)
escalc(xi=x1i, ti=t1i, measure="IR", data=dat.warfarin)$yi

### random-effects models
res <- rma(yi, vi, data=dat)
res

### harmonic mean of the ti's
ti.hm <- 1/(mean(1/dat$t1i))

### back-transformation using the harmonic mean
transf.iirft(res$b, ti=ti.hm)
transf.iirft(res$ci.lb, ti=ti.hm)
transf.iirft(res$ci.ub, ti=ti.hm)
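
For readers without metafor at hand, a base-R sketch of the arithmetic involved. It assumes the Freeman-Tukey transform of a rate is y = (sqrt(x/t) + sqrt(x/t + 1/t))/2, so that the inverse is (4y^2 - 1/t)^2 / (16y^2). The helpers ft() and ift() are hypothetical stand-ins written for this sketch, not metafor functions, and the counts and person-times are made up.

```r
# Made-up event counts and person-time values, for illustration only
xi <- c(3, 10, 7)
ti <- c(120, 400, 250)

# Hypothetical helpers implementing the assumed transform and its inverse
ft  <- function(x, t) 0.5 * (sqrt(x / t) + sqrt(x / t + 1 / t))
ift <- function(y, t) (4 * y^2 - 1 / t)^2 / (16 * y^2)

yi <- ft(xi, ti)
all.equal(ift(yi, ti), xi / ti)   # round trip recovers the raw rates

ti.hm  <- 1 / mean(1 / ti)        # harmonic mean of the person-time values
pooled <- mean(yi)                # crude stand-in for a model estimate
ift(pooled, ti.hm)                # back-transformed summary rate
```

The harmonic mean is needed because a pooled estimate has no person-time of its own, whereas each individual study can be back-transformed with its own ti.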

Best,
Wolfgang

--
Wolfgang Viechtbauer, Ph.D., Statistician
Department of Psychiatry and Psychology
School for Mental Health and Neuroscience
Faculty of Health, Medicine, and Life Sciences
Maastricht University, P.O. Box 616 (VIJV1)
6200 MD Maastricht, The Netherlands
+31 (43) 388-4170 | http://www.wvbauer.com

From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of 
Liu, Chunyan [chunyan@cchmc.org]
Sent: Thursday, September 27, 2012 10:48 PM
To: r-help@R-project.org
Subject: [R] What to use for ti in back-transforming summary statistics from 
F-T double square-root transformation in 'metafor'

Hi Dr. Viechtbauer,

I'm doing a meta-analysis using your package 'metafor'. I used 'IRFT' to 
transform the incidence rates. But when I tried to back-transform the summary 
estimates from the function rma, I did not know what the appropriate ti to feed 
into the function transf.iirft is. I searched and found your post about using the 
harmonic mean of the ni to back-transform the double arcsine transformation. I'm 
hoping I can get your help on ti too.

Thanks.


Chunyan Liu

513-636-9763
Biostatistician II
Department of Biostatistics and Epidemiology
Cincinnati Children's Hospital Medical Center


[R] Anova and tukey-grouping

2012-09-28 Thread Landi
Hello,

I am really new to R and it's still a challenge to me.
Currently I'm working on my Master's Thesis. My supervisor works with SAS
and is not familiar with R at all.

I want to run an Anova, a tukey-test and as a result I want to have the
tukey-grouping ( something like A - AB - B)

I came across the HSD.test in the agricolae-package, but... unfortunately I
do not get an output (like here in the answer
http://stats.stackexchange.com/questions/31547/how-to-obtain-the-results-of-a-tukey-hsd-post-hoc-test-in-a-table-showing-groupe
)

I did it like this:

##   ANOVA
anova.typabunmit <- aov(ds.typabunmit$abun ~ ds.typabunmit$typ)
summary(anova.typabunmit)
summary.lm(anova.typabunmit)

## post HOC
tukey.typabunmit <- TukeyHSD(anova.typabunmit)
tukey.typabunmit

## HSD
HSD.test(anova.typabunmit, "abun", group=TRUE)



and the ONLY output is this:
Name:  abun 
 ds.typabunmit$typ 


I would be very grateful for any ideas!
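
A possible direction, offered as a guess rather than a confirmed diagnosis: the second argument of HSD.test() is meant to name the treatment factor in the model, not the response, and that name is easier to match when the model is fitted with plain column names plus data=. The sketch below uses made-up data in place of ds.typabunmit and only calls agricolae if it is installed.

```r
set.seed(1)
# Made-up data standing in for ds.typabunmit
ds <- data.frame(typ  = factor(rep(c("a", "b", "c"), each = 10)),
                 abun = c(rnorm(10, 5), rnorm(10, 5.5), rnorm(10, 8)))

fit <- aov(abun ~ typ, data = ds)   # plain column names plus data=
tuk <- TukeyHSD(fit)
tuk$typ                             # pairwise differences with adjusted p-values

# Letter grouping (A, AB, B, ...), only if agricolae is available
if (requireNamespace("agricolae", quietly = TRUE)) {
  grp <- agricolae::HSD.test(fit, "typ", group = TRUE)
  print(grp$groups)
}
```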





--
View this message in context: 
http://r.789695.n4.nabble.com/Anova-and-tukey-grouping-tp4644485.html
Sent from the R help mailing list archive at Nabble.com.



[R] Is it possible to enter in a function wich is within a library ?

2012-09-28 Thread ikuzar
Hello, 

I'd like to know if it is possible to step into a function which is included
in a library?
I know how to debug a function which is in an R file (but not in a library). But
it is not the same when the function is included in a library. I want to go
step by step through this function in order to inspect objects' values.  I tried
debug(the_function) but the program does not stop at the_function (it only
shows the body of the function).
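
A minimal sketch of flagging a package function for debugging, using sd() from the stats package purely as an example target. The flag can be checked with isdebugged(); the browser itself only opens on the next call to the function in an interactive session.

```r
debug(stats::sd)                      # flag sd() from the stats package
flagged <- isdebugged(stats::sd)
flagged

# In an interactive session, the next call such as sd(1:10) would now open
# the browser: type n to step line by line, Q to quit.

undebug(stats::sd)                    # remove the flag again
cleared <- !isdebugged(stats::sd)
cleared
```

For a one-off session debugonce() does the same without needing undebug(), and a function that is not exported from its package can be reached with the ::: operator.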

Thanks for your help.





--
View this message in context: 
http://r.789695.n4.nabble.com/Is-it-possible-to-enter-in-a-function-wich-is-within-a-library-tp4644488.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Running different Regressions using for loops

2012-09-28 Thread Krunal Nanavati
Ok...this solves a part of my problem

When I type lm.list2[2] ...I get the following output

[[1]]

Call:
lm(formula = as.formula(fmla), data = tryout2)

Coefficients:
(Intercept)   Price2   Media1  Distri1Trend
Seasonality
   13491232 -5759030-15203437048628
445351




When I enter lm.list2[[2]]$coefficient[2] it gives me the below
output

Price2
-5759030

And when I enter lm.list2[[2]]$coefficient[[2]] ...I get the
number...which is -5759030


I am looking for a way to get just the name Price2...is there a
statement for that??



Thanks & Regards,

Krunal Nanavati
9769-919198


-Original Message-
From: Rui Barradas [mailto:ruipbarra...@sapo.pt]
Sent: 28 September 2012 15:18
To: Krunal Nanavati
Cc: David Winsemius; r-help@r-project.org
Subject: Re: [R] Running different Regressions using for loops

Hello,

To access list elements you need `[[`, like this:

summ.list[[2]]$coefficients

Or Use the extractor function,

coef(summ.list[[2]])

Rui Barradas
On 28-09-2012 07:23, Krunal Nanavati wrote:
 Hi Rui,

 Excellent!!  This is what I was looking for. Thanks for the help.

 So, now I have stored the result of the 10 regressions in
 summ.list <- lapply(lm.list2, summary)

 And now once I enter summ.list it gives me the output for all
 the 10 regressions...

 I wanted to access a beta coefficient of one of the regressions...say
 Price2+Media1+Trend+Seasonality...the result of which is stored in
 summ.list[2]

 I entered the below statement for accessing the Beta coefficient for
 Price2...

 summ.list[2]$coefficients[2]
 NULL

 But this is giving me  NULL  as the output...

 What I am looking for, is to access a beta value of a particular
 variable from a particular regression output and use it for further
analysis.

 Can you please help me out with this. Greatly appreciate, you guys
 efforts.




 Thanks & Regards,

 Krunal Nanavati
 9769-919198

 -Original Message-
 From: Rui Barradas [mailto:ruipbarra...@sapo.pt]
 Sent: 27 September 2012 21:55
 To: Krunal Nanavati
 Cc: David Winsemius; r-help@r-project.org
 Subject: Re: [R] Running different Regressions using for loops

 Hello,

 Inline.
 On 27-09-2012 13:52, Krunal Nanavati wrote:
 Hi,

 Thanks for all your help. I am stuck again, but with a new problem,
 on similar lines.

 I have taken the problem to the next step now...i have now added 2
for
 loops... 1 for the Price variable...and another for the Media
 variable

 I have taken 5 price variables...and 2 media variables with the
 trend and seasonality(appearing in all of them)so in all there
 will be
 10 regression to run now

 Price 1, Media 1

 Price 1, Media 2

 Price 2, Media 1'

 Price 2, Media 2

 ...and so on

 I have built up a code for it...




 tryout=read.table("C:\\Users\\Krunal\\Desktop\\R tryout.csv",header=T,sep=",")
 cnames <- names(tryout)
 price <- cnames[grep("Price", cnames)]
 media <- cnames[grep("Media", cnames)]
 resp <- cnames[1]
 regr <- cnames[7:8]
 lm.list <- vector("list", 10)
 for(i in 1:5)
 {
   regress <- paste(price[i], paste(regr, collapse = "+"), sep = "+")
   for(j in 1:2) {
     regress1 <- paste(media[j],regress,sep="+")
     fmla <- paste(resp, regress1, sep = "~")
     lm.list[[i]] <- lm(as.formula(fmla), data = tryout)
   }
 }
 summ.list <- lapply(lm.list, summary)
 summ.list




 But it is only running...5 regressions...only Media 1 along with the
 5 Price variables & Trend & Seasonality is regressed on
 Volume...giving only 5 outputs

 I feel there is something wrong with the lm.list[[i]] <-
 lm(as.formula(fmla), data = tryout)   statement.
 No, I don't think so. If it's giving you only 5 outputs the error is
 probably in the fmla construction. Put print statements to see the
 results of those paste() instructions.

 Supposing your data.frame is now called tryout2,


 price <- paste("Price", 1:5, sep = "")
 media <- paste("Media", 1:2, sep = "")
 pricemedia <- apply(expand.grid(price, media,
 stringsAsFactors = FALSE), 1, paste, collapse="+")

 response <- "Volume"
 trendseason <- "Trend+Seasonality"  # do this only once

 lm.list2 <- list()
 for(i in seq_along(pricemedia)){
   regr <- paste(pricemedia[i], trendseason, sep = "+")
   fmla <- paste(response, regr, sep = "~")
   lm.list2[[i]] <- lm(as.formula(fmla), data = tryout2)
 }

 The trick is to use ?expand.grid

 Hope this helps,

 Rui Barradas

I am not sure about its
 placement...whether it should be in loop 2 or in loop 1

 Can you please help me out??










 Thanks & Regards,

 Krunal Nanavati
 9769-919198

 -Original Message-
 From: Rui Barradas [mailto:ruipbarra...@sapo.pt]
 Sent: 27 September 2012 16:22
 To: David Winsemius
 Cc: Krunal Nanavati; r-help@r-project.org
 Subject: Re: [R] Running different Regressions using for loops

 Hello,

 Just to add that you can also

 lapply(lm.list, coef)

 with a different output.

 Rui Barradas
 On 27-09-2012 09:24, David Winsemius wrote:
 On Sep 26, 2012, at 10:31 PM, Krunal Nanavati wrote:

 Dear 

Re: [R] Running different Regressions using for loops

2012-09-28 Thread Krunal Nanavati
Hi,

Yes the thing that you provided...works fine...but probably I should have
asked for some other thing.

Here is what I am trying to do

I am trying to get the mean of the Price variable...so I am entering the
below function:

 mean(names(lm.list2[[2]]$coefficient[2] ))

but this gives me an error

[1] NA
Warning message:
In mean.default(names(lm.list2[[2]]$coefficient[2])) :
argument is not numeric or logical: returning NA

I thought that getting the text from the list variable...would help me
generate the mean for that text...which is a variable in the data...say
Price 1, Media 2...and so on

Is this a proper approach...if it is...then something more needs to be
done with the function that you provided.

If not, is there a better way...to generate the mean of a particular
variable inside the  for loop  used earlier...given below:

 lm.list2 <- list()
 for(i in seq_along(pricemedia)){
   regr <- paste(pricemedia[i], trendseason, sep = "+")
   fmla <- paste(response, regr, sep = "~")
   lm.list2[[i]] <- lm(as.formula(fmla), data = tryout2)
 }




Thanks & Regards,

Krunal Nanavati
9769-919198


-Original Message-
From: Rui Barradas [mailto:ruipbarra...@sapo.pt]
Sent: 28 September 2012 16:02
To: Krunal Nanavati
Cc: David Winsemius; r-help@r-project.org
Subject: Re: [R] Running different Regressions using for loops

Hello,

Try

names(lm.list2[[2]]$coefficient[2] )

Rui Barradas
On 28-09-2012 11:29, Krunal Nanavati wrote:
 Ok...this solves a part of my problem

 When I type lm.list2[2] ...I get the following output

 [[1]]

 Call:
 lm(formula = as.formula(fmla), data = tryout2)

 Coefficients:
 (Intercept)   Price2   Media1  Distri1Trend
 Seasonality
 13491232 -5759030-15203437048628
 445351




 When I enter lm.list2[[2]]$coefficient[2] it gives me the below
 output

 Price2
 -5759030

 And when I enter lm.list2[[2]]$coefficient[[2]] ...I get the
 number...which is -5759030


 I am looking for a way to get just the name Price2...is there a
 statement for that??



 Thanks & Regards,

 Krunal Nanavati
 9769-919198


 -Original Message-
 From: Rui Barradas [mailto:ruipbarra...@sapo.pt]
 Sent: 28 September 2012 15:18
 To: Krunal Nanavati
 Cc: David Winsemius; r-help@r-project.org
 Subject: Re: [R] Running different Regressions using for loops

 Hello,

 To access list elements you need `[[`, like this:

 summ.list[[2]]$coefficients

 Or Use the extractor function,

 coef(summ.list[[2]])

 Rui Barradas
 On 28-09-2012 07:23, Krunal Nanavati wrote:
 Hi Rui,

 Excellent!!  This is what I was looking for. Thanks for the help.

 So, now I have stored the result of the 10 regressions in
 summ.list <- lapply(lm.list2, summary)

 And now once I enter summ.list it gives me the output for all
 the 10 regressions...

 I wanted to access a beta coefficient of one of the regressions...say
 Price2+Media1+Trend+Seasonality...the result of which is stored in
 summ.list[2]

 I entered the below statement for accessing the Beta coefficient for
 Price2...

 summ.list[2]$coefficients[2]
 NULL

 But this is giving me  NULL  as the output...

 What I am looking for, is to access a beta value of a particular
 variable from a particular regression output and use it for further
 analysis.
 Can you please help me out with this. Greatly appreciate, you guys
 efforts.




 Thanks & Regards,

 Krunal Nanavati
 9769-919198

 -Original Message-
 From: Rui Barradas [mailto:ruipbarra...@sapo.pt]
 Sent: 27 September 2012 21:55
 To: Krunal Nanavati
 Cc: David Winsemius; r-help@r-project.org
 Subject: Re: [R] Running different Regressions using for loops

 Hello,

 Inline.
 On 27-09-2012 13:52, Krunal Nanavati wrote:
 Hi,

 Thanks for all your help. I am stuck again, but with a new problem,
 on similar lines.

 I have taken the problem to the next step now...i have now added 2
 for
 loops... 1 for the Price variable...and another for the Media
 variable

 I have taken 5 price variables...and 2 media variables with the
 trend and seasonality(appearing in all of them)so in all there
 will be
 10 regression to run now

 Price 1, Media 1

 Price 1, Media 2

 Price 2, Media 1'

 Price 2, Media 2

 ...and so on

 I have built up a code for it...




 tryout=read.table("C:\\Users\\Krunal\\Desktop\\R tryout.csv",header=T,sep=",")
 cnames <- names(tryout)
 price <- cnames[grep("Price", cnames)]
 media <- cnames[grep("Media", cnames)]
 resp <- cnames[1]
 regr <- cnames[7:8]
 lm.list <- vector("list", 10)
 for(i in 1:5)
 {
   regress <- paste(price[i], paste(regr, collapse = "+"), sep = "+")
   for(j in 1:2) {
     regress1 <- paste(media[j],regress,sep="+")
     fmla <- paste(resp, regress1, sep = "~")
     lm.list[[i]] <- lm(as.formula(fmla), data = tryout)
   }
 }
 summ.list <- lapply(lm.list, summary)
 summ.list



 But it is only running...5 regressions...only Media 1 along with the
 5 Price variables & Trend & Seasonality is 

Re: [R] Running different Regressions using for loops

2012-09-28 Thread Krunal Nanavati
Ok...I am sorry for the misunderstanding

what I am trying to do is


 lm.list2 <- list()
 for(i in seq_along(pricemedia)){
   regr <- paste(pricemedia[i], trendseason, sep = "+")
   fmla <- paste(response, regr, sep = "~")
   lm.list2[[i]] <- lm(as.formula(fmla), data = tryout2)
 }


When I run...this set of statements...the 1st regression to be run will
have Price 1, Media 1...as X variables...and in the second loop it will
have Price 1 & Media 2

So, what I was thinking is...if I can generate, inside the for loop...the
mean for Price 1 and Media 1 during the 1st loop...and then the mean for
Price 1 and Media 2 during the second loop...and so on...for all the 10
regressions


Is the method that I was trying appropriate...or is there a better method
there...I am sorry for the earlier explanation, I hope this one makes it
more understandable
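
One way this could be done, sketched on the built-in mtcars data as a stand-in for tryout2 (an assumption; the fits and the name x.means are illustrative): take the model frame of each fitted regression, drop the response column, and compute the column means of the regressors that were actually used.

```r
fits <- list(lm(mpg ~ wt + hp, data = mtcars),
             lm(mpg ~ disp + hp, data = mtcars))

x.means <- lapply(fits, function(fit) {
  mf <- model.frame(fit)   # the data actually used in this fit
  colMeans(mf[-1])         # drop the response (first column), average the rest
})
x.means
```

The same two lines can be placed inside an existing for loop right after the lm() call, storing the result in a second list indexed by i.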


Thanks for your time...and all the quick replies




Thanks & Regards,

Krunal Nanavati
9769-919198


-Original Message-
From: Rui Barradas [mailto:ruipbarra...@sapo.pt]
Sent: 28 September 2012 16:49
To: Krunal Nanavati
Cc: David Winsemius; r-help@r-project.org
Subject: Re: [R] Running different Regressions using for loops

Ok, if I'm understanding it well, you want the mean value of Price1, ...,
Price5? I don't know if it makes any sense, the coefficients already are
mean values, but see if this is it.

price.coef <- sapply(lm.list, function(x) coef(x)[2])
mean(price.coef)

Rui Barradas
On 28-09-2012 12:07, Krunal Nanavati wrote:
 Hi,

 Yes the thing that you provided...works fine...but probably I should
 have asked for some other thing.

 Here is what I am trying to do

 I am trying to get the mean of the Price variable...so I am entering the
 below function:

   mean(names(lm.list2[[2]]$coefficient[2] ))

 but this gives me an error

   [1] NA
   Warning message:
   In mean.default(names(lm.list2[[2]]$coefficient[2])) :
   argument is not numeric or logical: returning NA

 I thought that getting the text from the list variable...would help me
 generate the mean for that text...which is a variable in the
 data...say Price 1, Media 2...and so on

 Is this a proper approach...if it is...then something more needs to be
 done with the function that you provided.

 If not, is there a better way...to generate the mean of a particular
 variable inside the  for loop  used earlier...given below:

 lm.list2 <- list()
 for(i in seq_along(pricemedia)){
   regr <- paste(pricemedia[i], trendseason, sep = "+")
   fmla <- paste(response, regr, sep = "~")
   lm.list2[[i]] <- lm(as.formula(fmla), data = tryout2)
 }



 Thanks & Regards,

 Krunal Nanavati
 9769-919198


 -Original Message-
 From: Rui Barradas [mailto:ruipbarra...@sapo.pt]
 Sent: 28 September 2012 16:02
 To: Krunal Nanavati
 Cc: David Winsemius; r-help@r-project.org
 Subject: Re: [R] Running different Regressions using for loops

 Hello,

 Try

 names(lm.list2[[2]]$coefficient[2] )

 Rui Barradas
 On 28-09-2012 11:29, Krunal Nanavati wrote:
 Ok...this solves a part of my problem

 When I type lm.list2[2] ...I get the following output

 [[1]]

 Call:
 lm(formula = as.formula(fmla), data = tryout2)

 Coefficients:
 (Intercept)   Price2   Media1  Distri1Trend
 Seasonality
  13491232 -5759030-15203437048628
 445351




 When I enter lm.list2[[2]]$coefficient[2] it gives me the below
 output

 Price2
 -5759030

 And when I enter lm.list2[[2]]$coefficient[[2]] ...I get the
 number...which is -5759030


 I am looking for a way to get just the name Price2...is there a
 statement for that??



 Thanks & Regards,

 Krunal Nanavati
 9769-919198


 -Original Message-
 From: Rui Barradas [mailto:ruipbarra...@sapo.pt]
 Sent: 28 September 2012 15:18
 To: Krunal Nanavati
 Cc: David Winsemius; r-help@r-project.org
 Subject: Re: [R] Running different Regressions using for loops

 Hello,

 To access list elements you need `[[`, like this:

 summ.list[[2]]$coefficients

 Or Use the extractor function,

 coef(summ.list[[2]])

 Rui Barradas
 On 28-09-2012 07:23, Krunal Nanavati wrote:
 Hi Rui,

 Excellent!!  This is what I was looking for. Thanks for the help.

 So, now I have stored the result of the 10 regressions in
 summ.list <- lapply(lm.list2, summary)

 And now once I enter summ.list it gives me the output for all
 the 10 regressions...

 I wanted to access a beta coefficient of one of the regressions...say
 Price2+Media1+Trend+Seasonality...the result of which is stored in
 summ.list[2]

 I entered the below statement for accessing the Beta coefficient for
 Price2...

 summ.list[2]$coefficients[2]
 NULL

 But this is giving me  NULL  as the output...

 What I am looking for, is to access a beta value of a particular
 variable from a particular regression output and use it for further
 analysis.
 Can you please help me out with this. Greatly appreciate, you guys
 efforts.

[R] RES: Generating an autocorrelated binary variable

2012-09-28 Thread André Gabriel
I think the package binarySimCLF can help.
See http://cran.r-project.org/web/packages/binarySimCLF/binarySimCLF.pdf.


André Gabriel.



-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
behalf of Rolf Turner
Sent: Friday, September 28, 2012 00:02
To: Simon Zehnder
Cc: r help
Subject: Re: [R] Generating an autocorrelated binary variable


I have no idea what your code is doing, nor why you want correlated binary
variables.  Correlation makes little or no sense in the context of binary
random variables --- or more generally in the context of discrete random
variables.

Be that as it may, it is an easy calculation to show that if X and Y are
binary random variables both with success probability of 0.5 then cor(X,Y) =
0.2 if and only if Pr(X=1 | Y = 1) = 0.6.  So just generate X and Y using
that fact:

set.seed(42)
X <- numeric(1000)
Y <- numeric(1000)
for(i in 1:1000) {
    Y[i] <- rbinom(1,1,0.5)
    X[i] <- if(Y[i]==1) rbinom(1,1,0.6) else rbinom(1,1,0.4)
}

# Check:
cor(X,Y) # Get 0.2012336

Looks about right.  Note that the sample proportions are 0.484 and
0.485 for X and Y respectively.  These values do not differ significantly
from 0.5.
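
The same conditional-probability fact can be turned into a single autocorrelated sequence, sketched below under the assumption that a two-state Markov chain is acceptable: each value repeats the previous one with probability 0.6 and flips otherwise, which gives a lag-1 autocorrelation of 2 * 0.6 - 1 = 0.2 and a marginal probability of 0.5. The names p_stay and stay are illustrative. By the same formula, a stay probability of 0.2 would give -0.6, which may be related to the magnitude reported in the original question.

```r
set.seed(42)
n <- 100000
p_stay <- 0.6                       # Pr(x[t] == x[t-1]); autocorrelation is 2 * p_stay - 1

stay <- rbinom(n - 1, 1, p_stay)    # 1 = keep the previous value, 0 = flip it
x <- numeric(n)
x[1] <- rbinom(1, 1, 0.5)
for (t in 2:n) {
  x[t] <- if (stay[t - 1] == 1) x[t - 1] else 1 - x[t - 1]
}

mean(x)               # should be close to 0.5
cor(x[-1], x[-n])     # lag-1 autocorrelation, should be close to 0.2
```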

 cheers,

 Rolf Turner

On 28/09/12 08:26, Simon Zehnder wrote:
 Hi R-fellows,

 I am trying to simulate a multivariate correlated sample via the Gaussian
copula method. One variable is a binary variable, that should be
autocorrelated. The autocorrelation should be rho = 0.2. Furthermore, the
overall probability to get either outcome of the binary variable should be
0.5.
 Below you can see the R code (I use for simplicity a diagonal matrix in
rmvnorm even if it produces no correlated sample):

 sampleCop <- function(n = 1000, rho = 0.2) {
   
   require(splus2R)
   mvrs <- rmvnorm(n + 1, mean = rep(0, 3), cov = diag(3))
   pmvrs <- pnorm(mvrs, 0, 1)
   var1 <- matrix(0, nrow = n + 1, ncol = 1)
   var1[1] <- qbinom(pmvrs[1, 1], 1, 0.5)
   if(var1[1] == 0) var1[nrow(mvrs)] <- -1
   for(i in  1:(nrow(pmvrs) - 1)) {
   if(pmvrs[i + 1, 1] <= rho) var1[i + 1] <- var1[i]
   else var1[i + 1] <- var1[i] * (-1)
   }
   sample <- matrix(0, nrow = n, ncol = 4)
   sample[, 1] <- var1[1:nrow(var1) - 1]
   sample[, 2] <- var1[2:nrow(var1)]
   sample[, 3] <- qnorm(pmvrs[1:nrow(var1) - 1, 2], 0, 1, 1, 0)
   sample[, 4] <- qnorm(pmvrs[1:nrow(var1) - 1, 3], 0, 1, 1, 0)
   
   sample
   
 }

 Now, the code is fine, everything compiles. But when I compute the
autocorrelation of the binary variable, it is not 0.2, but 0.6. Does anyone
know why this happens?
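
One way to see where 0.6 can come from: for a two-state chain that keeps its previous value with probability q and flips otherwise, the lag-1 autocorrelation is 2q - 1. If the loop keeps the previous value whenever a uniform draw is at or above rho, as the code above appears to do, then q = 0.8 and the autocorrelation is 0.6. Solving 2q - 1 = rho gives q = (1 + rho) / 2 = 0.6 for rho = 0.2. The following is a minimal sketch under that reading; sim_binary_ar1 is a hypothetical helper, not part of the original code, and it does not use the copula machinery.

```r
# Simulate a 0/1 series with marginal probability 0.5 and lag-1
# autocorrelation rho by flipping the previous value with
# probability 1 - (1 + rho) / 2.
sim_binary_ar1 <- function(n, rho = 0.2) {
  stay  <- (1 + rho) / 2               # probability of repeating the last value
  flips <- rbinom(n - 1, 1, 1 - stay)  # 1 = flip, 0 = keep
  start <- rbinom(1, 1, 0.5)           # fair starting value
  (start + c(0, cumsum(flips))) %% 2   # parity of the flips so far
}

set.seed(1)
x <- sim_binary_ar1(1e5)
lag1 <- cor(x[-1], x[-length(x)])      # empirical lag-1 autocorrelation
```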

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Re: [R] How to write R package

2012-09-28 Thread Duncan Murdoch

On 27/09/2012 5:15 PM, Dr. Alireza Zolfaghari wrote:

Hi List,
Would you please send me a good link to talk me through on how to write a R
package?



See the ?package.skeleton help page.  After you have run it, follow the 
instructions in the Read-and-delete-me file that it will create.


For full details, see the Writing R Extensions manual.

For modifying the package after you've finished the Read-and-delete-me 
instructions, just manually add *.R files where the rest of them are, 
and use the prompt() function to produce skeleton documentation.
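
The workflow described above can be sketched as follows. This is an illustration only: the package name demoPkg, the functions add_one and add_two, and the use of a temporary directory are all assumptions made for the example.

```r
# A throwaway environment holding the objects to package (hypothetical names)
e <- new.env()
e$add_one <- function(x) x + 1

pkg_parent <- tempfile("pkgdemo")
dir.create(pkg_parent)

# Creates demoPkg/ with DESCRIPTION, NAMESPACE, R/, man/ and Read-and-delete-me
package.skeleton(name = "demoPkg", list = "add_one",
                 environment = e, path = pkg_parent)
pkg_dir <- file.path(pkg_parent, "demoPkg")

# Later: add a new function by hand next to the existing *.R files and
# let prompt() write skeleton documentation for it
add_two <- function(x) x + 2
dump("add_two", file = file.path(pkg_dir, "R", "add_two.R"))
prompt(add_two, filename = file.path(pkg_dir, "man", "add_two.Rd"))
```

The generated .Rd files still need to be filled in by hand before the package will pass R CMD check.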


That's about it, but you can read more if you like in a tutorial I gave 
a few years ago at a UseR meeting in Dortmund:


http://www.statistik.uni-dortmund.de/useR-2008/slides/Murdoch.pdf

Duncan Murdoch



Re: [R] changing outlier shapes of boxplots using lattice

2012-09-28 Thread Richard M. Heiberger
Elaine,

For panel.bwplot you see that the central dot and the outlier dots are
controlled by
the same pch argument.  I initially set the pch="|" to match your first
example with the horizontal
indicator for the median.  I would be inclined to use the default circle
for the outliers and
therefore also for the median.
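
For comparison, here is a minimal sketch using lattice's own default panel function rather than the HH panel function discussed in this thread (that substitution is an assumption made for the example). With the default panel.bwplot, pch controls the median marker, while the outlier symbol is taken from the plot.symbol setting, so the two can be set separately.

```r
library(lattice)

set.seed(1)
dataN <- data.frame(GE_distance = rnorm(260),
                    Diet_B = factor(rep(1:13, each = 20)))

# Median drawn as a line (pch = "|"); outliers drawn as open circles
# via the plot.symbol setting
p <- bwplot(GE_distance ~ Diet_B, data = dataN,
            pch = "|",
            par.settings = list(plot.symbol = list(pch = 1, col = "black")))
# print(p) draws the plot on the current device
```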

Rich

On Fri, Sep 28, 2012 at 7:13 AM, Sarah Goslee sarah.gos...@gmail.com wrote:

 I would guess that if you find the bit that says pch="|" and change it to
 pch=1 it will solve your question, and that reading ?par will tell you why.

 Sarah

 On Thursday, September 27, 2012, Elaine Kuo wrote:

  Hello
 
  This is Elaine.
 
  I am using package lattice to generate boxplots.
  Using Richard's code, the display was almost perfect except the outlier
  shape.
  Based on the following code, the outliers are vertical lines.
  However, I want the outliers to be empty circles.
  Please kindly help how to modify the code to change the outlier shapes.
  Thank you.
 
  code
  library(lattice)

  dataN <- data.frame(GE_distance=rnorm(260),
                      Diet_B=factor(rep(1:13, each=20)))

  Diet.colors <- c("forestgreen", "darkgreen", "chocolate1", "darkorange2",
                   "sienna2", "red2", "firebrick3", "saddlebrown", "coral4",
                   "chocolate4", "darkblue", "navy", "grey38")

  levels(dataN$Diet_B) <- Diet.colors

  bwplot(GE_distance ~ Diet_B, data=dataN,
         xlab=list("Diet of Breeding Ground", cex = 1.4),
         ylab = list(
           "Distance between Centers of B and NB Range (1000 km)",
           cex = 1.4),
         panel=panel.bwplot.intermediate.hh,
         col=Diet.colors,
         pch=rep("|",13),
         scales=list(x=list(rot=90)),
         par.settings=list(box.umbrella=list(lty=1)))
 
 


 --
 Sarah Goslee
 http://www.stringpage.com
 http://www.sarahgoslee.com
 http://www.functionaldiversity.org



Re: [R] Anova and tukey-grouping

2012-09-28 Thread arun
HI,

I guess there is a mistake in your code.  You should have used typ instead of 
abun as abun is the dependent variable.
summary(fm1 <- aov(breaks ~ wool + tension, data = warpbreaks))
myresults <- TukeyHSD(fm1, "tension", ordered = TRUE)
library(agricolae)

HSD.test(fm1, "wool", group=TRUE)
#Study:
#HSD Test for breaks 
#Mean Square Error:  134.9578 
#wool,  means
#    breaks  std.err replication
#A 31.03704 3.050609  27
#B 25.25926 1.789963  27
#alpha: 0.05 ; Df Error: 50 
#Critical Value of Studentized Range: 2.840532 
#Honestly Significant Difference: 6.350628 
#Means with the same letter are not significantly different.
#Groups, Treatments and means
#a      A      31.037037037037 
#a      B      25.2592592592593 

 

A.K.



- Original Message -
From: Landi ent-ar...@gmx.de
To: r-help@r-project.org
Cc: 
Sent: Friday, September 28, 2012 5:41 AM
Subject: [R] Anova and tukey-grouping

Hello,

I am really new to R and it's still a challenge to me.
Currently I'm working on my Master's Thesis. My supervisor works with SAS
and is not familiar with R at all.

I want to run an Anova, a tukey-test and as a result I want to have the
tukey-grouping ( something like A - AB - B)

I came across the HSD.test in the agricolae-package, but... unfortunately I
do not get an output (like here in the answer
http://stats.stackexchange.com/questions/31547/how-to-obtain-the-results-of-a-tukey-hsd-post-hoc-test-in-a-table-showing-groupe
)

I did it like this:

##   ANOVA
anova.typabunmit <- aov(ds.typabunmit$abun ~ ds.typabunmit$typ)
summary(anova.typabunmit)
summary.lm(anova.typabunmit)

## post HOC
tukey.typabunmit <- TukeyHSD(anova.typabunmit)
tukey.typabunmit

## HSD
HSD.test(anova.typabunmit, "abun", group=TRUE)



and the ONLY output is this:
Name:  abun 
ds.typabunmit$typ 


I would be very grateful for some ideas!







Re: [R] Crosstable-like analysis (ks test) of dataframe

2012-09-28 Thread Johannes Radinger
Thank you Rui!

that works as I want it... :)

/Johannes

On Fri, Sep 28, 2012 at 12:30 PM, Rui Barradas ruipbarra...@sapo.pt wrote:
 Hello,

 Try the following.


 f <- function(x, y, ...,
         alternative = c("two.sided", "less", "greater"), exact = NULL){
     #w <- getOption("warn")
     #options(warn = -1)  # ignore warnings
     p <- ks.test(x, y, ..., alternative = alternative, exact =
         exact)$p.value
     #options(warn = w)
     p
 }

 n <- 1e1
 dat <- data.frame(X=rnorm(n), Y=runif(n), Z=rchisq(n, df=3))

 apply(dat, 2, function(x) apply(dat, 2, function(y) f(x, y)))
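
A variation on the same idea, as a minimal sketch: pairwise_ks is a hypothetical helper name, and it uses nested sapply() over the data frame columns so that the row and column names of the result are the column names of the input. Warnings about ties (which occur on the diagonal, where a column is compared with itself) are suppressed here as an illustrative choice.

```r
# Matrix of two-sample Kolmogorov-Smirnov p-values for all column pairs
pairwise_ks <- function(d) {
  sapply(d, function(x)
    sapply(d, function(y)
      suppressWarnings(ks.test(x, y)$p.value)))
}

set.seed(1)
dat <- data.frame(X = rnorm(50, 2), Y = rnorm(50, 1), Z = rnorm(50, 2))
pmat <- pairwise_ks(dat)
round(pmat, 3)
```

The diagonal is 1 by construction, and for the two-sided test the matrix is symmetric, so only one triangle carries information.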

 Hope this helps,

 Rui Barradas
 On 28-09-2012 11:10, Johannes Radinger wrote:

 Hi,

 I have a dataframe with multiple (appr. 20) columns containing
 vectors of different values (different distributions).
   Now I'd like to create a crosstable
 where I compare the distribution of each vector (df-column) with
 each other. For the comparison I want to use the ks.test().
 The result should contain as row and column names the column names
 of the input dataframe and the cells should be populated with
 the p-value of the ks.test for each pairwise analysis.

 My data.frame looks like:
 df <- data.frame(X=rnorm(1000,2),Y=rnorm(1000,1),Z=rnorm(1000,2))

 And the test for one single case is:
 ks <- ks.test(df$X,df$Z)

 where the p value is:
 ks[2]

 How can I create an automatized way of this pairwise analysis?
 Any suggestions? I guess that is a quite common analysis (probably with
 other tests).

 cheers,
 Johannes



[R] Lattice bwplot(): Conditioning on one factor

2012-09-28 Thread Rich Shepard

  I'm not able to create the proper syntax to specify a lattice bwplot() for
only one of two conditioning factors.

  The syntax that produces a box plot of each of the two conditioning
factors is:

bwplot(quant ~ param | era, data=mg.d, main='Dissolved Magnesium', 
ylab='Concentration (mg/L)')

  What I've tried unsuccessfully are:

bwplot(quant ~ param | factor(era=='Pre-mining'), data=mg.d,
main='Magnesium', ylab='Concentration (mg/L))

bwplot(quant ~ param | era, data=mg.d, main='Magnesium', ylab='Concentration
(mg/L)', subset=era('Pre-mining'))

plus slight variations of the above. None work.

  Please point me to what I've missed in specifying only one of two
conditioning factors for the plot.

Rich



Re: [R] Simple Question

2012-09-28 Thread Bhupendrasinh Thakre
Many thanks Dr. Winsemius , Kimmo and Pascal
All of them are working and really beautiful...

Best Regards,


Bhupendrasinh Thakre

*Disclaimer :*

The information contained in this communication is confidential and may be
legally privileged. It is intended solely for the use of the individual or
entity to whom it is adressed. If you are not the intended recipient you
are hereby (a) notified that any disclosure, copying, distribution or
taking any action with respect to the content of this information is
strictly prohibited and may be unlawful, and (b) kindly requested to inform
the sender immediately and destroy any copies.



On Fri, Sep 28, 2012 at 1:36 AM, David Winsemius dwinsem...@comcast.net wrote:


 On Sep 27, 2012, at 11:13 PM, Bhupendrasinh Thakre wrote:

 
  Hi Everyone,
 
  I am trying a very simple task to append the Timestamp with a variable
 name so something like
  a_2012_09_27_00_12_30 <- rnorm(1,2,1).

 If you want to assign a value to a character-name you need to use ...
 `assign`. You cannot just stick a numeric value, which is what you get with
 Sys.time(), on the LHS of a <- and expect R to intuit what you intend.

 ?assign
 assign("a_2012_09_27_00_12_30", rnorm(1,2,1))
 assign(as.character(unclass(Sys.time())), rnorm(1,2,1))

 (I would have thought you wanted to format that Sys.time() result:)

  format(Sys.time(), "%Y_%m_%d_%H_%M_%S")
 [1] "2012_09_27_23_32_40"

  assign(format(Sys.time(), "%Y_%m_%d_%H_%M_%S"), rnorm(1,2,1))
  grep("^2012", ls(), value=TRUE)
 [1] "2012_09_27_23_33_45"


 
  Tried some commands but it doesn't work out well. Hope someone has some
 answer on it.
 
  Session Info
 
  R version 2.15.1 (2012-06-22)
  Platform: i386-apple-darwin9.8.0/i386 (32-bit)
 
  locale:
  [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
 
  attached base packages:
  [1] stats     graphics  grDevices utils     datasets  methods   base
 
  other attached packages:
  [1] chron_2.3-42    twitteR_0.99.19 rjson_0.2.9     RCurl_1.91-1
  [5] bitops_1.0-4.1  tm_0.5-7.1      RMySQL_0.9-3    DBI_0.2-5
 
  loaded via a namespace (and not attached):
  [1] slam_0.1-24  tools_2.15.1
 
  Statement I tried :
 
  b <- unclass(Sys.time())
  b = 1348812597
  c_b <- rnorm(1,2,1)
 
  Works perfect but doesn't show me c_1348812597.
 
  Best Regards,
 
 
  Bhupendrasinh Thakre
[[alternative HTML version deleted]]

 BT; Please learn to post in plain text. It's really very simple with gmail.

 --

 David Winsemius, MD
 Alameda, CA, USA





Re: [R] [R-sig-hpc] Quickest way to make a large empty file on disk?

2012-09-28 Thread jens . oehlschlaegel

   Jonathan,
   ff has a utility function file.resize() which lets you set a new file size
   in bytes, given as a double.
   See ?file.resize
   Regards
   Jens Oehlschlägel
   Sent: Thursday, 27 September 2012 at 21:17
   From: Jonathan Greenberg j...@illinois.edu
   To: r-help r-help@r-project.org, r-sig-...@r-project.org
   Subject: Re: [R-sig-hpc] Quickest way to make a large empty file on disk?
   Folks:
   Asked this question some time ago, and found what appeared (at first) to be
   the best solution, but I'm now finding a new problem. First off, it seemed
   like ff as Jens suggested worked:
   # outdata_ncells = the number of rows * number of columns * number of bands
   # in an image:
   out <- ff(vmode="double", length=outdata_ncells, filename=filename)
   finalizer(out) <- "close"
   close(out)
   This was working fine until I attempted to set length to a VERY large
   number: outdata_ncells = 17711913600. This would create a file that is
   131.964GB. Big, but not obscenely so (and certainly not larger than the
   filesystem can handle). However, length appears to be restricted
   by .Machine$integer.max (I'm on a 64-bit windows box):
.Machine$integer.max
   [1] 2147483647
   Any suggestions on how to solve this problem for much larger file sizes?
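
The figures quoted above can be checked directly, and a plain base R fallback is to write the zeros in fixed-size chunks so that memory use stays bounded. This is a minimal sketch only: write_zeros is a hypothetical helper, it is not the ff-based approach from this thread, and it actually writes every byte, so it will be much slower than a sparse-file approach for files of this size.

```r
# The requested length exceeds the 32-bit integer limit, and at 8 bytes
# per double the file would be about 131.964 GiB
outdata_ncells <- 17711913600
exceeds_int <- outdata_ncells > .Machine$integer.max
size_gib <- outdata_ncells * 8 / 2^30

# Write n_values double zeros to path, at most `chunk` values at a time
write_zeros <- function(path, n_values, chunk = 1e6) {
  con <- file(path, "wb")
  on.exit(close(con))
  remaining <- n_values              # kept as a double, so it may exceed 2^31 - 1
  while (remaining > 0) {
    k <- min(chunk, remaining)
    writeBin(numeric(k), con)        # numeric(k) is k zeros, 8 bytes each
    remaining <- remaining - k
  }
  invisible(path)
}

# Small demonstration: 250,000 doubles, written in chunks of 100,000
demo_file <- tempfile(fileext = ".bin")
write_zeros(demo_file, 2.5e5, chunk = 1e5)
demo_size <- file.info(demo_file)$size
```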
   --j
   On Thu, May 3, 2012 at 10:44 AM, Jonathan Greenberg j...@illinois.edu wrote:
Thanks, all! I'll try these out. I'm trying to work up something that is
platform independent (if possible) for use with mmap. I'll do some tests
on these suggestions and see which works best. I'll try to report back in
a few days. Cheers!
   
--j
   
   
   
2012/5/3 Jens Oehlschlägel jens.oehlschlae...@truecluster.com
   
Jonathan,
   
On some filesystems (e.g. NTFS, see below) it is possible to create
'sparse' memory-mapped files, i.e. reserving the space without the cost of
actually writing initial values.
Package 'ff' does this automatically and also allows access to the file
in parallel. Check the example below and see how creating a big file is
immediate.
   
Jens Oehlschlägel
   
   
> library(ff)
> library(snowfall)
> ncpus <- 2
> n <- 1e8
> system.time(
+ x <- ff(vmode="double", length=n, filename="c:/Temp/x.ff")
+ )
   user  system elapsed
   0.01    0.00    0.02
> # check finalizer, with an explicit filename we should have a 'close' finalizer
> finalizer(x)
[1] "close"
> # if not, set it to 'close' in order to not let slaves delete x on slave shutdown
> finalizer(x) <- "close"
> sfInit(parallel=TRUE, cpus=ncpus, type="SOCK")
R Version: R version 2.15.0 (2012-03-30)

snowfall 1.84 initialized (using snow 0.3-9): parallel execution on 2 CPUs.

> sfLibrary(ff)
Library ff loaded.
Library ff loaded in cluster.

Warning message:
In library(package = "ff", character.only = TRUE, pos = 2, warn.conflicts = TRUE, :
  'keep.source' is deprecated and will be ignored
> sfExport("x") # note: do not export the same ff multiple times
> # explicitly opening avoids a gc problem
> # opening with 'mmeachflush' instead of 'mmnoflush' is a bit slower but
> # prevents OS write storms when the file is larger than RAM
> sfClusterEval(open(x, caching="mmeachflush"))
[[1]]
[1] TRUE

[[2]]
[1] TRUE

> system.time(
+ sfLapply( chunk(x, length=ncpus), function(i){
+ x[i] <- runif(sum(i))
+ invisible()
+ })
+ )
   user  system elapsed
   0.00    0.00   30.78
> system.time(
+ s <- sfLapply( chunk(x, length=ncpus), function(i) quantile(x[i],
+ c(0.05, 0.95)) )
+ )
   user  system elapsed
   0.00    0.00    4.38
> # for completeness
> sfClusterEval(close(x))
[[1]]
[1] TRUE

[[2]]
[1] TRUE

> csummary(s)
             5%  95%
Min.    0.04998 0.95
1st Qu. 0.04999 0.95
Median  0.05001 0.95
Mean    0.05001 0.95
3rd Qu. 0.05002 0.95
Max.    0.05003 0.95
> # stop slaves
> sfStop()

Stopping cluster

> # with the close finalizer we are responsible for deleting the file
> # explicitly (unless we want to keep it)
> delete(x)
[1] TRUE
> # remove r-side metadata
> rm(x)
> # truly free memory
> gc()
   
   
   
*Sent:* Thursday, 03 May 2012 at 00:23
*From:* Jonathan Greenberg j...@illinois.edu
*To:* r-help r-help@r-project.org, r-sig-...@r-project.org
*Subject:* [R-sig-hpc] Quickest way to make a large empty file on disk?
R-helpers:

What would be the absolute fastest way to make a large empty file (e.g.
filled with all zeroes) on disk, given a byte size and a given number
of empty values. I know I can use writeBin, but the object in
this case may be far too large to store in main memory. I'm asking because
I'm going to use this file in conjunction with mmap to do parallel writes
to this file. Say, I want to create a blank file of 10,000 

Re: [R] Anova and tukey-grouping

2012-09-28 Thread Landi
Hello !

Thanks for your advice. I tried it, but the output is the same:
 HSD.test(anova.typabunmit, "typ", group=TRUE)
Name:  typ 
 ds.typabunmit$typ 

I don't get the values...!?!?
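
One possible cause, offered as an assumption since the data are not shown: the model was fitted with ds.typabunmit$abun ~ ds.typabunmit$typ, so the term in the fitted model is named "ds.typabunmit$typ" rather than "typ", and a lookup by the short name may find nothing to group. The sketch below uses the built-in warpbreaks data and base R only to show how the term name changes when the formula uses bare column names together with the data argument.

```r
# Formula written with $ : the term label carries the data frame name
fm_dollar <- aov(warpbreaks$breaks ~ warpbreaks$tension)
lab_dollar <- attr(terms(fm_dollar), "term.labels")

# Formula written with bare names plus data= : the term label is the column name
fm_data <- aov(breaks ~ tension, data = warpbreaks)
lab_data <- attr(terms(fm_data), "term.labels")

# With the short term name, post hoc functions can be asked for "tension"
tuk <- TukeyHSD(fm_data, "tension")
tuk
```

Refitting the model as aov(abun ~ typ, data = ds.typabunmit) before calling HSD.test() with "typ" would be the analogous change to try.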





Re: [R] List of Variables in Original Order

2012-09-28 Thread rkulp
AK: Thanks, that was very helpful. It led me to think of the function 
names() (in base), which provided the vector of names in the correct order. I 
then used the same matrix formatting and everything worked out exactly 
as planned.
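
A small sketch of the difference, using a made-up data frame (the column names and values are illustrative only): names() returns the columns in their original order, while ls() returns them sorted alphabetically.

```r
d <- data.frame(zeta  = c(1, 2, 4),
                alpha = c(2, 1, 5),
                mid   = c(3, 3, 1),
                beta  = c(0, 2, 2))

orig_order  <- names(d)   # order in which the columns were defined
alpha_order <- ls(d)      # alphabetical order

# Names of the other variables on the first row, and their correlations
# with the first variable on the second row
others <- names(d)[-1]
cors   <- format(cor(d[[1]], d[others]), digits = 4)
tab    <- rbind(others, cors)
tab
```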
Dick
On 9/28/2012 1:09 AM, arun kirshna [via R] wrote:


 HI,
 Maybe this helps you:
 set.seed(1)
 mat1 <- matrix(rnorm(60,5),nrow=5,ncol=12)
 colnames(mat1) <- paste0("Var",1:12)
 vec2 <- format(c(1,cor(mat1[,1],mat1[,2:12])),digits=4)
 vec3 <- colnames(mat1)
 arr2 <- array(rbind(vec3,vec2),dim=c(2,3,4))
 res <- data.frame(do.call(rbind,lapply(1:dim(arr2)[3],function(i)
     arr2[,,i])))
 res
 #        X1       X2       X3
 #1     Var1     Var2     Var3
 #2  1.0  0.27890 -0.61497
 #3     Var4     Var5     Var6
 #4  0.24916 -0.76155  0.30853
 #5     Var7     Var8     Var9
 #6 -0.46413  0.79287  0.05191
 #7    Var10    Var11    Var12
 #8 -0.06940 -0.53251  0.06766

 A.K.




Re: [R] Running different Regressions using for loops

2012-09-28 Thread David Winsemius

On Sep 28, 2012, at 4:35 AM, Krunal Nanavati wrote:

 Ok...I am sorry for the misunderstanding
 
 what I am trying to do is

Perhaps (and that is a really large 'perhaps'):

lm.list2 <- list()
lm.means <- list()
for(i in seq_along(pricemedia)){
   regr <- paste(pricemedia[i], trendseason, sep = "+")
   fmla <- paste(response, regr, sep = "~")
   lm.list2[[i]] <- lm(as.formula(fmla), data = tryout2)
   lm.means[[i]] <- mean(lm.list2[[i]]$coefficients[c("Price1", "Media1")])
}



 
 When I run...this set of statements...the 1st regression to be run, will
 have Price 1, Media 1...as X variables...and in the second loop it will
 have Price 1 & Media 2
 
 So, what I was thinking is...if I can generate inside the for loop...the
 mean for Price 1 and Media 1 during the 1st loop...and then mean for
 Price 1 and Media 2 during the second loop...and so on...for all the 10
 regressions
 
 
 Is the method that I was trying appropriate...or is there a better method
 there...I am sorry for the earlier explanation, I hope this one makes it
 more understandable

One generally wants one's methods to be determinate while allowing the results
to be approximate.

Had you followed the posting guide and offered a reproducible example it would
have been much more understandable.


 
 
 Thanks for your time...and all the quick replies
 
 
 -Original Message-
 From: Rui Barradas [mailto:ruipbarra...@sapo.pt]
 Sent: 28 September 2012 16:49
 To: Krunal Nanavati
 Cc: David Winsemius; r-help@r-project.org
 Subject: Re: [R] Running different Regressions using for loops
 
 Ok, if I'm understanding it well, you want the mean value of Price1, ...,
 Price5? I don't know if it makes any sense, the coefficients already are
 mean values, but see if this is it.
 
 price.coef <- sapply(lm.list, function(x) coef(x)[2])
 mean(price.coef)
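
Since the thread has no reproducible data, here is a minimal sketch of the same pattern on the built-in mtcars data. The choice of mtcars and of the variables mpg, wt, hp, disp and cyl is an assumption made purely for illustration. It fits one regression per predictor, then extracts the second coefficient, its name, and the mean of the data column with that name, which covers the three things asked for in this exchange.

```r
response   <- "mpg"
predictors <- c("wt", "hp", "disp")

# One regression per predictor, each with cyl as a common covariate
fits <- lapply(predictors, function(p) {
  fmla <- paste(response, "~", p, "+ cyl")
  lm(as.formula(fmla), data = mtcars)
})
names(fits) <- predictors

# Value and name of the second coefficient in every model
slopes     <- sapply(fits, function(m) unname(coef(m)[2]))
coef_names <- sapply(fits, function(m) names(coef(m))[2])

# Mean of the slopes, and mean of each underlying data column
mean_slope <- mean(slopes)
col_means  <- sapply(coef_names, function(v) mean(mtcars[[v]]))
```

Note that mean() of a coefficient name fails because the name is a character string; looking the column up by that name, as in the last line, is what returns a number.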
 
 Rui Barradas
 On 28-09-2012 12:07, Krunal Nanavati wrote:
 Hi,
 
 Yes the thing that you provided...works fine...but probably I should
 have asked for some other thing.
 
 Here is what I am trying to do
 
 I am trying to get the mean of Price variable...so I am entering the
 below function:
 
  mean(names(lm.list2[[2]]$coefficient[2] ))
 
 but this gives me an error
 
  [1] NA
  Warning message:
  In mean.default(names(lm.list2[[2]]$coefficient[2])) :
  argument is not numeric or logical: returning NA
 
 I thought by getting the text from the list variable...will help me
 generate the mean for that text...which is a variable in the
 data...say Price 1, Media 2...and so on
 
 Is this a proper approach...if it is...then something more needs to be
 done with the function that you provided.
 
 If not, is there a better way...to generate the mean of a particular
 variable inside the "for loop" used earlier...given below:
 
 lm.list2 <- list()
 for(i in seq_along(pricemedia)){
   regr <- paste(pricemedia[i], trendseason, sep = "+")
   fmla <- paste(response, regr, sep = "~")
   lm.list2[[i]] <- lm(as.formula(fmla), data = tryout2) }
 
 
 
 Thanks & Regards,
 
 Krunal Nanavati
 9769-919198
 
 
 -Original Message-
 From: Rui Barradas [mailto:ruipbarra...@sapo.pt]
 Sent: 28 September 2012 16:02
 To: Krunal Nanavati
 Cc: David Winsemius; r-help@r-project.org
 Subject: Re: [R] Running different Regressions using for loops
 
 Hello,
 
 Try
 
 names(lm.list2[[2]]$coefficient[2] )
 
 Rui Barradas
 On 28-09-2012 11:29, Krunal Nanavati wrote:
 Ok...this solves a part of my problem
 
 When I type "lm.list2[2]" ...I get the following output
 
 [[1]]
 
 Call:
 lm(formula = as.formula(fmla), data = tryout2)
 
 Coefficients:
 (Intercept)   Price2   Media1  Distri1Trend
 Seasonality
 13491232 -5759030-15203437048628
 445351
 
 
 
 
 When I enter "lm.list2[[2]]$coefficient[2]" it gives me the below
 output
 
 Price2
 -5759030
 
 And when I enter "lm.list2[[2]]$coefficient[[2]]" ...I get the
 number...which is   -5759030
 
 
 I am looking out for a way to get just the "Price2"...is there a
 statement for that??
 
 
 
 Thanks & Regards,
 
 Krunal Nanavati
 9769-919198
 
 
 -Original Message-
 From: Rui Barradas [mailto:ruipbarra...@sapo.pt]
 Sent: 28 September 2012 15:18
 To: Krunal Nanavati
 Cc: David Winsemius; r-help@r-project.org
 Subject: Re: [R] Running different Regressions using for loops
 
 Hello,
 
 To access list elements you need `[[`, like this:
 
 summ.list[[2]]$coefficients
 
 Or Use the extractor function,
 
 coef(summ.list[[2]])
 
 Rui Barradas
 On 28-09-2012 07:23, Krunal Nanavati wrote:
 Hi Rui,
 
 Excellent!!  This is what I was looking for. Thanks for the help.
 
 So, now I have stored the result of the 10 regressions in
 summ.list <- lapply(lm.list2, summary)
 
 And now once I enter "summ.list" it gives me the output for all
 the 10 regressions...
 
 I wanted to access a beta coefficient of one of the
 regressions...say 

[R] max summary contradict each other

2012-09-28 Thread Sam Steingold
why does summary report max 27600 and not 27603?

 x <- c(27603, 1)
 max(x)
[1] 27603
 summary(x)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
      1    6902   13800   13800   20700   27600 

-- 
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
http://www.childpsy.net/ http://memri.org http://pmw.org.il
http://dhimmi.com http://iris.org.il http://mideasttruth.com
Vegetarians eat Vegetables, Humanitarians are scary.



Re: [R] max summary contradict each other

2012-09-28 Thread Duncan Murdoch

On 28/09/2012 12:14 PM, Sam Steingold wrote:

why does summary report max 27600 and not 27603?

 x <- c(27603, 1)
 max(x)
[1] 27603
 summary(x)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
      1    6902   13800   13800   20700   27600



Because you asked for 3 digit accuracy.  See ?summary.
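
To make the rounding concrete: in R of this vintage, summary() on a numeric vector rounds its results to `digits` significant digits, and the default for that argument is max(3, getOption("digits") - 3), which works out to four significant digits under the default options. The sketch below reproduces the printed values with signif(); it is an illustration of the rounding, not the actual source of summary(). Newer R versions may defer the rounding to printing, so their output can differ.

```r
x <- c(27603, 1)

q <- quantile(x)                                 # exact quartiles
exact   <- c(q[1:3], mean(x), q[4:5])            # same order as summary()
rounded <- signif(exact, 4)                      # four significant digits
rounded

# Asking for more digits keeps the maximum intact
summary(x, digits = 7)
```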

Duncan Murdoch



Re: [R] Lattice bwplot(): Conditioning on one factor

2012-09-28 Thread David Winsemius

On Sep 28, 2012, at 7:49 AM, Rich Shepard wrote:

  I'm not able to create the proper syntax to specify a lattice bwplot() for
 only one of two conditioning factors.

Wouldn't that involve specifying the 'subset' parameter (if bwplot accepts a 
subset argument) or using the 'subset' function to pass the desired rows to the 
data argument if it doesn't?
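
Both routes can be sketched as follows. The data frame below is made up for illustration (the real mg.d is not shown in the thread, and the level name 'Post-mining' is an assumption). The first call uses bwplot's subset argument; the second passes an already subsetted data frame, with unused factor levels dropped so that no empty panel is drawn.

```r
library(lattice)

set.seed(1)
mg.d <- data.frame(
  quant = rexp(40),
  param = factor("Mg"),
  era   = factor(rep(c("Pre-mining", "Post-mining"), each = 20))
)

# Route 1: the subset argument, evaluated inside the data frame
p1 <- bwplot(quant ~ param, data = mg.d, subset = era == "Pre-mining",
             main = "Magnesium", ylab = "Concentration (mg/L)")

# Route 2: subset the data first, then drop the unused era level
mg_pre <- droplevels(subset(mg.d, era == "Pre-mining"))
p2 <- bwplot(quant ~ param | era, data = mg_pre,
             main = "Magnesium", ylab = "Concentration (mg/L)")
```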

 
  The syntax that produces a box plot of each of the two conditioning
 factors is:
 
 bwplot(quant ~ param | era, data=mg.d, main='Dissolved Magnesium', 
 ylab='Concentration (mg/L)')
 
  What I've tried unsuccessfully are:
 
 bwplot(quant ~ param | factor(era=='Pre-mining'), data=mg.d,
 main='Magnesium', ylab='Concentration (mg/L))
 
 bwplot(quant ~ param | era, data=mg.d, main='Magnesium', ylab='Concentration
 (mg/L)', subset=era('Pre-mining'))
 
 plus slight variations of the above. None work.
 
  Please point me to what I've missed in specifying only one of two
 conditioning factors for the plot.
 
 
-- 

David Winsemius, MD
Alameda, CA, USA



Re: [R] Simple Question

2012-09-28 Thread Bhupendrasinh Thakre
Hi Everyone,

Sorry for coming back again with a new problem.
Editing question, session info and data so you don't have to scroll till
the end of page.

*Situation :*

I have a data frame and its name is df. Now I want to add a time stamp to
the end of the *name of the data frame, i.e. df_system_time*. Previously it
was running great, thanks to Dr. Winsemius, Kimmo and Pascal, and I
believe that was because the value I assigned was a scalar.

*Data :*

dput(df)
structure(list(x = 1:10, y = 1:10), .Names = c("x", "y"),
row.names = c(NA, -10L), class = "data.frame")

*Session Info :*

R version 2.15.1 (2012-06-22)
Platform: i386-pc-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats     graphics  grDevices datasets  utils     methods
[7] base

other attached packages:
[1] rcom_2.2-5     rscproxy_2.0-5

loaded via a namespace (and not attached):
 [1] colorspace_1.1-1   dichromat_1.2-4    digest_0.5.2
 [4] ggplot2_0.9.2.1    grid_2.15.1        gtable_0.1.1
 [7] labeling_0.1       MASS_7.3-18        memoise_0.1
[10] munsell_0.3        plyr_1.7.1         proto_0.3-9.2
[13] RColorBrewer_1.0-5 reshape2_1.2.1     scales_0.2.2
[16] stringr_0.6.1      tools_2.15.1


It's kind of very easy in SQL but I love doing all the work in R so don't
want to leave for just changing the name.
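
The assign() approach from earlier in this thread works for a data frame just as it does for a scalar. A minimal sketch, assuming the aim is a copy of df stored under a time-stamped name; the variable names stamp, new_name, env and snapshots are illustrative. A named list is shown as an alternative that avoids creating variables with computed names.

```r
df <- data.frame(x = 1:10, y = 1:10)

# Build the new name from the current time
stamp    <- format(Sys.time(), "%Y_%m_%d_%H_%M_%S")
new_name <- paste0("df_", stamp)

# Store a copy of df under that name in the current environment
env <- environment()
assign(new_name, df, envir = env)
copy <- get(new_name, envir = env)

# Alternative: keep time-stamped copies together in one named list
snapshots <- list()
snapshots[[new_name]] <- df
names(snapshots)
```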

Best Regards,

Bhupendrasinh Thakre











On Fri, Sep 28, 2012 at 10:13 AM, Bhupendrasinh Thakre 
vickytha...@gmail.com wrote:

 Many thanks Dr. Winsemius , Kimmo and Pascal
 All of them are working and really beautiful...

 Best Regards,


 Bhupendrasinh Thakre




 On Fri, Sep 28, 2012 at 1:36 AM, David Winsemius dwinsem...@comcast.net wrote:


 On Sep 27, 2012, at 11:13 PM, Bhupendrasinh Thakre wrote:

 
  Hi Everyone,
 
  I am trying a very simple task to append the Timestamp with a variable
 name so something like
  a_2012_09_27_00_12_30 <- rnorm(1,2,1).

 If you want to assign a value to a character-name you need to use ...
 `assign`. You cannot just stick a numeric value, which is what you get with
 Sys.time(), on the LHS of a <- and expect R to intuit what you intend.

 ?assign
 assign( "a_2012_09_27_00_12_30" ,  rnorm(1,2,1) )
 assign( as.character(unclass(Sys.time())) ,  rnorm(1,2,1) )

 (I would have thought you wanted to format that Sys.time result:)

  format(Sys.time(), "%Y_%m_%d_%H_%M_%S")
 [1] "2012_09_27_23_32_40"

  assign(format(Sys.time(), "%Y_%m_%d_%H_%M_%S"),  rnorm(1,2,1) )
  grep("^2012", ls(), value=TRUE)
 [1] "2012_09_27_23_33_45"


 
  Tried some commands but it doesn't work out well. Hope someone has some
 answer on it.
 
  Session Info
 
  R version 2.15.1 (2012-06-22)
  Platform: i386-apple-darwin9.8.0/i386 (32-bit)
 
  locale:
  [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
 
  attached base packages:
  [1] stats graphics  grDevices utils datasets  methods   base
 
  other attached packages:
  [1] chron_2.3-42twitteR_0.99.19 rjson_0.2.9 RCurl_1.91-1
  bitops_1.0-4.1  tm_0.5-7.1  RMySQL_0.9-3DBI_0.2-5
 
  loaded via a namespace (and not attached):
  [1] slam_0.1-24  tools_2.15.1
 
  Statement I tried :
 
  b <- unclass(Sys.time())
  b = 1348812597
  c_b <- rnorm(1,2,1)
 
  Works perfect but doesn't show me c_1348812597.
 
  Best Regards,
 
 
  Bhupendrasinh Thakre

 BT; Please learn to post in plain text. It's really very simple with
 gmail.

 --

 David Winsemius, MD
 Alameda, CA, USA






Re: [R] Lattice bwplot(): Conditioning on one factor

2012-09-28 Thread Bert Gunter
A small reproducible example, as requested by the posting guide, would
have been very helpful here (if you provide one, use ?dput to provide
the data). You have also not told us what you mean by "unsuccessful",
so we are left to guess what sort of problems you experienced.  "None
work" is completely useless to help diagnose the problem. This means
we waste time going back and forth trying to elucidate what you mean.
Please consider these things if/when you post in future.

In any case, my guess is that param is numeric and it should be a
factor, so, e.g.

 bwplot(quant ~ factor(param) | era, data=mg.d,
        main='Dissolved Magnesium', ylab='Concentration (mg/L)')

might be what you want. But of course, it may be completely wrong.

Cheers,
Bert





On Fri, Sep 28, 2012 at 9:25 AM, David Winsemius dwinsem...@comcast.net wrote:

 On Sep 28, 2012, at 7:49 AM, Rich Shepard wrote:

  I'm not able to create the proper syntax to specify a lattice bwplot() for
 only one of two conditioning factors.

 Wouldn't that involve specifying the 'subset' parameter (if bwplot accepts a 
 subset argument) or using the 'subset' function to pass the desired rows to 
 the data argument if it doesn't?


  The syntax that produces a box plot of each of the two conditioning
 factors is:

 bwplot(quant ~ param | era, data=mg.d, main='Dissolved Magnesium', 
 ylab='Concentration (mg/L)')

  What I've tried unsuccessfully are:

 bwplot(quant ~ param | factor(era=='Pre-mining'), data=mg.d,
 main='Magnesium', ylab='Concentration (mg/L))

 bwplot(quant ~ param | era, data=mg.d, main='Magnesium', ylab='Concentration
 (mg/L)', subset=era('Pre-mining'))

 plus slight variations of the above. None work.

  Please point me to what I've missed in specifying only one of two
 conditioning factors for the plot.


 --

 David Winsemius, MD
 Alameda, CA, USA




-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm




Re: [R] max summary contradict each other

2012-09-28 Thread arun
Hi,
Try this:
summary(x,digits=max(5))
#   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  #  1.0  6901.5 13802.0 13802.0 20702.0 27603.0 
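The difference appears to come from rounding for display, not from a different maximum. A minimal sketch, assuming the behaviour of R 2.15.x, where summary() rounds its results to four significant digits by default (newer R versions may handle this differently):

```r
x <- c(27603, 1)

# max() returns the stored value unchanged
max(x)

# Rounding to four significant digits reproduces the value shown by summary()
signif(max(x), digits = 4)

# Asking for five significant digits keeps the full value
signif(max(x), digits = 5)
```

Passing a larger digits value to summary(), as above, is one way to see the unrounded figures.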
A.K.




- Original Message -
From: Sam Steingold s...@gnu.org
To: r-help@r-project.org
Cc: 
Sent: Friday, September 28, 2012 12:14 PM
Subject: [R] max  summary contradict each other

why does summary report max 27600 and not 27603?

 x <- c(27603, 1)
 max(x)
[1] 27603
 summary(x)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
      1    6902   13800   13800   20700   27600 

-- 
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
http://www.childpsy.net/ http://memri.org http://pmw.org.il
http://dhimmi.com http://iris.org.il http://mideasttruth.com
Vegetarians eat Vegetables, Humanitarians are scary.



Re: [R] Lattice bwplot(): Conditioning on one factor

2012-09-28 Thread Rich Shepard

On Fri, 28 Sep 2012, David Winsemius wrote:


Wouldn't that involve specifying the 'subset' parameter (if bwplot accepts
a subset argument) or using the 'subset' function to pass the desired rows
to the data argument if it doesn't?


David,

  That's what I tried:


bwplot(quant ~ param | era, data=mg.d, main='Magnesium', ylab='Concentration
(mg/L)', subset=era('Pre-mining'))


  Perhaps I didn't write it correctly.

Thanks,

Rich



Re: [R] Simple Question

2012-09-28 Thread Berend Hasselman

On 28-09-2012, at 18:40, Bhupendrasinh Thakre vickytha...@gmail.com wrote:

 Hi Everyone,
 
 Sorry for coming back again with a new problem.
 Editing question, session info and data so you don't have to scroll till
 the end of page.
 
 *Situation :*
 
 I have a data frame and it's name is df. Now I want to add Time Stamp to
 the end of *name of data Frame i.e. df_system_time*. Previously it
 was running great and thanks to Dr. Winsemius , Kimmo and Pascal and I
 believe as the function which i used was scalar.
 
 *Data :*
 
 dput(df)
 structure(list(x = 1:10, y = 1:10), .Names = c("x", "y"),
 row.names = c(NA,
 -10L), class = "data.frame")
 

You have been given the answer.
It only needs a minor variation:

newname.df <- paste0("df_", format(Sys.time(), "%Y_%m_%d_%H_%M_%S") )
assign(newname.df,df)

and if you wish

rm(list=c('df','newname.df'))
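A self-contained sketch of the same idea, in case it helps to run it end to end. The names new_name, stamped and snapshots are only illustrative, and the list-based alternative at the end is an assumption of mine, not something proposed in the thread:

```r
# Data frame matching the dput() output in the question
df <- data.frame(x = 1:10, y = 1:10)

# Build the new name from the current time, then bind a copy of df to it
new_name <- paste0("df_", format(Sys.time(), "%Y_%m_%d_%H_%M_%S"))
assign(new_name, df)

# Retrieve the object again through its computed name
stamped <- get(new_name)
identical(stamped, df)

# Alternative (assumption): keep time-stamped copies in a named list
# instead of creating new variables in the workspace
snapshots <- list()
snapshots[[new_name]] <- df
names(snapshots)
```

The list variant avoids having to look objects up with get() later, at the cost of one extra level of indexing.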

Or install package memisc (found by doing findFn("rename") from package sos) 
and use function rename(); I have not tried this.

Berend


Re: [R] [R-sig-hpc] Quickest way to make a large empty file on disk?

2012-09-28 Thread Jonathan Greenberg
Rui:

Quick follow-up -- it looks like seek does do what I want (I see Simon
suggested it some time ago) -- what do you mean by "trash your disk"?  What I'm
trying to accomplish is getting parallel, asynchronous writes to a large
binary image (just a binary file) working.  Each node writes to a different
sector of the file via mmap, filling in the values as the process runs,
but the file needs to be pre-created before I can mmap it.  Running a
writeBin with a bunch of 0s would mean I'd basically have to write the file
twice, but the seek/ff trick seems to be much faster.
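The seek approach described above can be sketched roughly as follows. This is only an illustration under assumptions: make_empty_file is a hypothetical helper, the file is written to a temporary path, and the behaviour of seeking past the end of a file is the one seen on typical *nix filesystems (the ?seek warning about Windows still applies):

```r
# Hypothetical helper: create a file of `size_bytes` bytes by seeking to the
# last byte position and writing a single zero byte there.
make_empty_file <- function(path, size_bytes) {
  con <- file(path, open = "wb")
  on.exit(close(con))                      # always release the connection
  seek(con, where = size_bytes - 1, origin = "start", rw = "write")
  writeBin(raw(1), con)                    # one 0x00 byte at the final position
  invisible(path)
}

tmp <- tempfile(fileext = ".bin")
make_empty_file(tmp, 1024)
file.info(tmp)$size
```

Whether the skipped region is stored sparsely or physically zero-filled depends on the filesystem, so timings can differ a lot between machines.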

Do I risk doing some damage to my filesystem if I use seek?  I see there is
a strongly worded warning in the help for ?seek:

"Use of seek on Windows is discouraged. We have found so many errors in the
Windows implementation of file positioning that users are advised to use it
only at their own risk, and asked not to waste the *R* developers' time
with bug reports on Windows' deficiencies." -- there's no detail here on
which errors people have experienced, so I'm not sure if doing something as
simple as just creating a file using seek falls under the "discouraging"
category.

As a note, we are trying to work this up on both Windows and *nix systems,
hence our wanting to have a single approach that works on both OSs.

--j


On Thu, Sep 27, 2012 at 3:49 PM, Rui Barradas ruipbarra...@sapo.pt wrote:

  Hello,

 If you really need to trash your disk, why not use seek()?

  fl <- file("Test.txt", open = "wb")
  seek(fl, where = 1024, origin = "start", rw = "write")
 [1] 0
  writeChar(character(1), fl, nchars = 1, useBytes = TRUE)
 Warning message:
 In writeChar(character(1), fl, nchars = 1, useBytes = TRUE) :
   writeChar: more characters requested than are in the string - will zero-pad
  close(fl)


 File Test.txt is now 1Kb in size.

 Hope this helps,

 Rui Barradas
 Em 27-09-2012 20:17, Jonathan Greenberg escreveu:

 Folks:

 Asked this question some time ago, and found what appeared (at first) to be
 the best solution, but I'm now finding a new problem.  First off, it seemed
 like ff as Jens suggested worked:

 # outdata_ncells = the number of rows * number of columns * number of bands
 in an image:
 out<-ff(vmode="double",length=outdata_ncells,filename=filename)
 finalizer(out) <- "close"
 close(out)

 This was working fine until I attempted to set length to a VERY large
 number: outdata_ncells = 17711913600.  This would create a file that is
 131.964GB.  Big, but not obscenely so (and certainly not larger than the
 filesystem can handle).  However, length appears to be restricted
 by .Machine$integer.max (I'm on a 64-bit windows box):

  .Machine$integer.max

  [1] 2147483647
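The numbers above can be checked with a few lines of arithmetic. A small sketch, assuming 8 bytes per double (the variable names bytes_per_double and size_gib are only illustrative):

```r
outdata_ncells <- 17711913600   # rows * columns * bands, from the message
bytes_per_double <- 8           # assumption: vmode "double" uses 8 bytes

# The requested length does not fit into a 32-bit signed integer
outdata_ncells > .Machine$integer.max

# Resulting file size in GiB (1024^3 bytes)
size_gib <- outdata_ncells * bytes_per_double / 1024^3
round(size_gib, 3)
```

The length is a little over eight times the integer limit, so splitting the image into smaller pieces would need at least nine of them.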

 Any suggestions on how to solve this problem for much larger file sizes?

 --j


 On Thu, May 3, 2012 at 10:44 AM, Jonathan Greenberg j...@illinois.edu wrote:


  Thanks, all!  I'll try these out.  I'm trying to work up something that is
 platform independent (if possible) for use with mmap.  I'll do some tests
 on these suggestions and see which works best. I'll try to report back in a
 few days.  Cheers!

 --j



 2012/5/3 Jens Oehlschlägel jens.oehlschlae...@truecluster.com

  Jonathan,

 On some filesystems (e.g. NTFS, see below) it is possible to create
 'sparse' memory-mapped files, i.e. reserving the space without the cost of
 actually writing initial values.
 Package 'ff' does this automatically and also allows to access the file
 in parallel. Check the example below and see how big file creation is
 immediate.

 Jens Oehlschlägel



 library(ff)
 library(snowfall)
 ncpus <- 2
 n <- 1e8
 system.time(
 + x <- ff(vmode="double", length=n, filename="c:/Temp/x.ff")
 + )
    User  System elapsed
    0.01    0.00    0.02

 # check finalizer, with an explicit filename we should have a 'close' finalizer
 finalizer(x)
 [1] "close"

 # if not, set it to 'close' in order to not let slaves delete x on slave shutdown
 finalizer(x) <- "close"
 sfInit(parallel=TRUE, cpus=ncpus, type="SOCK")
 R Version:  R version 2.15.0 (2012-03-30)

 snowfall 1.84 initialized (using snow 0.3-9): parallel execution on 2 CPUs.

 sfLibrary(ff)
 Library ff loaded.
 Library ff loaded in cluster.

 Warning message:
 In library(package = ff, character.only = TRUE, pos = 2, warn.conflicts = TRUE,  :
   'keep.source' is deprecated and will be ignored

 sfExport("x") # note: do not export the same ff multiple times
 # explicitly opening avoids a gc problem
 # opening with 'mmeachflush' instead of 'mmnoflush' is a bit slower but
 # prevents OS write storms when the file is larger than RAM
 sfClusterEval(open(x, caching="mmeachflush"))
 [[1]]
 [1] TRUE

 [[2]]
 [1] TRUE

 system.time(
 + sfLapply( chunk(x, length=ncpus), function(i){
 +   x[i] <- runif(sum(i))
 +   invisible()
 + })
 + )
    User  System elapsed
    0.00    0.00   30.78

 system.time(
 + s <- sfLapply( chunk(x, length=ncpus), function(i) 

Re: [R] Lattice bwplot(): Conditioning on one factor

2012-09-28 Thread David Winsemius

On Sep 28, 2012, at 9:56 AM, Rich Shepard wrote:

 On Fri, 28 Sep 2012, David Winsemius wrote:
 
 Wouldn't that involve specifying the 'subset' parameter (if bwplot accepts
 a subset argument) or using the 'subset' function to pass the desired rows
 to the data argument if it doesn't?
 
 David,
 
  That's what I tried:
 
 bwplot(quant ~ param | era, data=mg.d, main='Magnesium', ylab='Concentration
 (mg/L)', subset=era('Pre-mining'))

Sigh. If I were testing that strategy (which I did not try because you were too 
busy to have included a working example)  I would have written it:

bwplot(quant ~ param , data=mg.d, main='Magnesium', ylab='Concentration (mg/L)',
       subset= era=='Pre-mining' )

That passes a logical vector, which will work only if bwplot creates a local 
environment in which the column names of the 'data' argument are visible. 
I do not know if that is true. I just looked at the bwplot help 
page and do not see a subset argument documented there.

The other suggestion, which it seems you were also too busy to have tried, was:

bwplot(quant ~ param ,  main='Magnesium', ylab='Concentration (mg/L)',
       data = subset( mg.d,  era=='Pre-mining' ) )

Wrapping a factor level in parentheses after a column name (which R takes to 
mean there is a function named 'era' to be applied) and expecting R to 
understand that you want a subset seems doomed to failure.

It makes no sense to me to condition on a factor that you know for certainty 
has only one level in the data being offered.
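Both strategies can be tried on made-up data. The sketch below assumes the lattice package is installed; the data frame only borrows the column names from the thread, and its values and factor levels are invented for illustration:

```r
library(lattice)

# Synthetic stand-in for mg.d (values and levels are made up)
set.seed(1)
mg.d <- data.frame(
  quant = runif(40),
  param = factor(rep(c("Mg", "Ca"), each = 20)),
  era   = factor(rep(c("Pre-mining", "Post-mining"), times = 20))
)

# Strategy 1: let bwplot() select the rows; 'subset' is evaluated in 'data'
p1 <- bwplot(quant ~ param, data = mg.d, subset = era == "Pre-mining",
             ylab = "Concentration (mg/L)")

# Strategy 2: select the rows first and pass the result as 'data'
p2 <- bwplot(quant ~ param, data = subset(mg.d, era == "Pre-mining"),
             ylab = "Concentration (mg/L)")
```

Both calls return "trellis" objects; printing either one (for example print(p1)) draws the plot.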
--

David Winsemius, MD
Alameda, CA, USA



Re: [R] Lattice bwplot(): Conditioning on one factor

2012-09-28 Thread Rich Shepard

On Fri, 28 Sep 2012, David Winsemius wrote:


bwplot(quant ~ param , data=mg.d, main='Magnesium', ylab='Concentration
(mg/L)', subset= era=='Pre-mining' )


David, Don:

  Thank you. I tried subset= and era== separately, not together.

  Now I know.

Much appreciated,

Rich



Re: [R] Lattice bwplot(): Conditioning on one factor

2012-09-28 Thread Bert Gunter
Yes. Now I understand what was wanted.

1. the subset argument is certainly documented on the Help page:

subset  

An expression that evaluates to a logical or integer indexing vector.
Like groups, it is evaluated in data. Only the resulting rows of data
are used for the plot. If subscripts is TRUE, the subscripts provided
to the panel function will be indices referring to the rows of data
prior to the subsetting. Whether levels of factors in the data frame
that are unused after the subsetting will be dropped depends on the
drop.unused.levels argument.

Had the OP read this carefully, he would have presumably recognized
the errors in his specification.

2. Here is a small reproducible example to show how it should be done
(probably unnecessary now):

 df <- expand.grid(a = letters[1:3], b = LETTERS[1:2])
 df <- df[rep(1:6,10),]
 df$y <- runif(60)
 bwplot(y~a|b, data=df, subset = (b=="A"))
## The logical condition is parenthesized only for clarity

Cheers,
Bert






-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm



Re: [R] install.packages on windows

2012-09-28 Thread Uwe Ligges



On 28.09.2012 00:32, Duncan Murdoch wrote:

On 12-09-27 2:53 PM, Anju R wrote:

Sometimes when I try to install certain packages I get a warning message.
For example, I tried to install the package Imtest on windows R version
2.15.1 and got the following message:

Warning message:
package ‘Imtest’ is not available (for R version 2.15.1)

How can I install the above package? Why do I get the above Warning
message?


It probably means exactly what it says, except that the information is
about the mirror you are using.

I would try another mirror.  If that doesn't solve it, then it probably
means that the package is really not available for 2.15.1.

You can look on the cran.r-project.org website for information about it,
and probably download the source from there, but you will probably need
to fix whatever is wrong with it before it will work.



Or in other words:

There is no such package Imtest on CRAN, perhaps you are looking for 
lmtest?


Uwe Ligges




Duncan Murdoch



Re: [R] Anova and tukey-grouping

2012-09-28 Thread arun
Hi,

As I mentioned earlier, these are just guess work until you provide a subset of 
your data with dput().  Also, please check the structure of the data with str().

A.K.  






- Original Message -
From: Landi ent-ar...@gmx.de
To: r-help@r-project.org
Cc: 
Sent: Friday, September 28, 2012 10:35 AM
Subject: Re: [R] Anova and tukey-grouping

Hello !

Thanks for your advice. I tried it, but the output is the same:
 HSD.test(anova.typabunmit, "typ", group=TRUE)
Name:  typ 
ds.typabunmit$typ 

I don't get the values...!?!?



--
View this message in context: 
http://r.789695.n4.nabble.com/Anova-and-tukey-grouping-tp4644485p4644513.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] How to write R package

2012-09-28 Thread Uwe Ligges



On 28.09.2012 14:22, Duncan Murdoch wrote:

On 27/09/2012 5:15 PM, Dr. Alireza Zolfaghari wrote:

Hi List,
Would you please send me a good link to talk me through on how to
write a R
package?



See the ?package.skeleton help page.  After you have run it, follow the
instructions in the Read-and-delete-me file that it will create.

For full details, see the Writing R Extensions manual.

For modifying the package after you've finished the Read-and-delete-me
instructions, just manually add *.R files where the rest of them are,
and use the prompt() function to produce skeleton documentation.
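The workflow described above might look roughly like this. It is only a sketch: the function names add_one and times_two, the package name demoPkg, and the temporary directory are all made up for illustration:

```r
# A toy function to seed the package (hypothetical)
add_one <- function(x) x + 1

# Create the skeleton in a temporary directory instead of the working directory
pkg_dir <- tempfile("pkgdemo")
dir.create(pkg_dir)
package.skeleton(name = "demoPkg", list = "add_one",
                 environment = environment(), path = pkg_dir)

# The skeleton contains DESCRIPTION, R/, man/ and the Read-and-delete-me file
list.files(file.path(pkg_dir, "demoPkg"))

# Later additions: put the code in R/ by hand and let prompt() write an Rd stub
times_two <- function(x) x * 2
dump("times_two", file = file.path(pkg_dir, "demoPkg", "R", "times_two.R"))
prompt(times_two,
       filename = file.path(pkg_dir, "demoPkg", "man", "times_two.Rd"))
```

The generated Rd files are only templates and still need to be filled in before the package will pass R CMD check.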

That's about it, but you can read more if you like in a tutorial I gave
a few years ago at a UseR meeting in Dortmund:

http://www.statistik.uni-dortmund.de/useR-2008/slides/Murdoch.pdf




... and there are others who gave talks or tutorials about it (including 
myself).


Nevertheless, I'd recommend looking into the manual "Writing R 
Extensions", which is updated with R and with the changes in the 
package-related mechanisms --- while all our talks and tutorials won't get 
updated. Probably Duncan's is still correct, but I want to make this 
remark for the list's archives.


Best,
Uwe Ligges







Duncan Murdoch



Re: [R] Simple Question

2012-09-28 Thread Bhupendrasinh Thakre
Thanks a ton Berend. That worked like a charm..
R comes with thousands of Sweet Surprises everyday


Bhupendrasinh Thakre






Re: [R] Simple Question

2012-09-28 Thread Bert Gunter
On Fri, Sep 28, 2012 at 11:15 AM, Bhupendrasinh Thakre
vickytha...@gmail.com wrote:
 Thanks a ton Berend. That worked like a charm..
 R comes with thousands of Sweet Surprises everyday

-- Not for those who read the docs. :-o

-- Bert



 Bhupendrasinh Thakre




 On Sep 28, 2012, at 12:00 PM, Berend Hasselman b...@xs4all.nl wrote:


 On 28-09-2012, at 18:40, Bhupendrasinh Thakre vickytha...@gmail.com wrote:

 Hi Everyone,

 Sorry for coming back again with a new problem.
 Editing question, session info and data so you don't have to scroll till
 the end of page.

 *Situation :*

 I have a data frame and it's name is df. Now I want to add Time Stamp to
 the end of *name of data Frame i.e. df_system_time*. Previously it
 was running great and thanks to Dr. Winsemius , Kimmo and Pascal and I
 believe as the function which i used was scalar.

 *Data :*

 dput(df)structure(list(x = 1:10, y = 1:10), .Names = c(x, y),
 row.names = c(NA,
 -10L), class = data.frame)


 You have been given the answer.
 It only needs a minor variation:

 newname.df <- paste0("df_", format(Sys.time(), "%Y_%m_%d_%H_%M_%S"))
 assign(newname.df, df)

 and if you wish

 rm(list=c('df','newname.df'))

 Or install package memisc (found by doing findFn("rename") from package sos) 
 and use function rename(); I have not tried this.

 Berend





-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm



Re: [R] How to test if there is a subvector in a longer vector

2012-09-28 Thread Atte Tenkanen
Thank you!
___
From: Berend Hasselman [b...@xs4all.nl]
Sent: 28 September 2012 10:47
To: Atte Tenkanen
Cc: R help
Subject: Re: [R] How to test if there is a subvector in a longer vector

On 28-09-2012, at 07:41, Atte Tenkanen atte...@utu.fi wrote:

 Sorry. I should have mentioned that the order of the components is important.

 So c(1,4,6) is accepted as a subvector of c(2,1,1,4,6,3), but not of 
 c(2,1,1,6,4,3).

 How to test this?

See this discussion for a variety of solutions.

http://r.789695.n4.nabble.com/matching-a-sequence-in-a-vector-td4389523.html#a4393453
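
For readers without access to the linked discussion, here is one possible sketch. It assumes that "subvector" means a contiguous run in the same order, which fits both examples in the question; is_subvector is a hypothetical helper name, not a function from base R or from the linked thread.

```r
# Hypothetical helper: TRUE if `needle` occurs as a contiguous,
# order-preserving run inside `haystack`.
is_subvector <- function(needle, haystack) {
  n <- length(needle)
  if (n == 0) return(TRUE)
  if (n > length(haystack)) return(FALSE)
  # Every position where a window of length n can start.
  starts <- seq_len(length(haystack) - n + 1)
  any(vapply(starts,
             function(s) all(haystack[s:(s + n - 1)] == needle),
             logical(1)))
}

accepted <- is_subvector(c(1, 4, 6), c(2, 1, 1, 4, 6, 3))
rejected <- is_subvector(c(1, 4, 6), c(2, 1, 1, 6, 4, 3))
```

If gaps between the matched elements should be allowed, a different test is needed; this sketch does not cover that case.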

Berend



Re: [R] Arules - predict function issues - subscript out of bounds

2012-09-28 Thread alicechao
Hi Ankur, 

I am running into the exact same issue you have described above. Were you
able to find out why it didn't work on your data set and resolve it? If yes,
could you share? 

Many thanks and regards,
Alice 



--
View this message in context: 
http://r.789695.n4.nabble.com/Arules-predict-function-issues-subscript-out-of-bounds-tp4634422p4644546.html


Re: [R] [R-sig-hpc] Quickest way to make a large empty file on disk?

2012-09-28 Thread Rui Barradas

Hello,

I've written a function to try to answer your original request, but I've 
run into a problem; see the end of this message.

In the meantime, my comments are inline.
On 28-09-2012 17:44, Jonathan Greenberg wrote:

Rui:

Quick follow-up -- it looks like seek does do what I want (I see Simon
suggested it some time ago) -- what do you mean by "trash your disk"?
Nothing special, just that sometimes there are good ways of doing so. 
mmap seems to be safe.

   What I'm
trying to accomplish is getting parallel, asynchronous writes to a large
binary image (just a binary file) working.  Each node writes to a different
sector of the file via mmap, filling in the values as the process runs,
but the file needs to be pre-created before I can mmap it.  Running a
writeBin with a bunch of 0s would mean I'd basically have to write the file
twice, but the seek/ff trick seems to be much faster.

Do I risk doing some damage to my filesystem if I use seek?  I see there is
a strongly worded warning in the help for ?seek:

"Use of seek on Windows is discouraged. We have found so many errors in the
Windows implementation of file positioning that users are advised to use it
only at their own risk, and asked not to waste the *R* developers' time
with bug reports on Windows' deficiencies." -- there's no detail here on
which errors people have experienced, so I'm not sure if doing something as
simple as just creating a file using seek falls under the discouraging
category.


I'm not a great systems programmer, but 20+ years of using seek on 
Windows have shown nothing of the sort. In fact, I've just found a 
problem with ubuntu 12.04: where seek gives the expected result on 
Windows, on ubuntu it goes up to a certain point and then stops 
seeking, or whatever is happening. I installed ubuntu very recently, so 
I really don't know the reason for the behavior that you can see in the 
example run below. But I do know that Windows 7 is causing no problem, 
as expected.

As a note, we are trying to work this up on both Windows and *nix systems,
hence our wanting to have a single approach that works on both OSs.

--j


#
# Function: creates a file of ASCII nulls using seek/writeBin.
# File size can be big.
#
createBig <- function(filename, size){
    if(size == 0) return(0)
    chunk <- .Machine$integer.max
    nchunks <- as.integer(size / chunk)
    rest <- size - as.double(nchunks)*as.double(chunk)
    fl <- file(filename, open = "wb")
    for(i in seq_len(nchunks)){
        seek(fl, where = chunk - 1, origin = "current", rw = "write")
        writeBin(raw(1), fl)
        # -- debug --
        print(seek(fl, where = NA))
    }
    if(rest > 0){
        seek(fl, where = rest - 1, origin = "current", rw = "write")
        writeBin(raw(1), fl)
    }
    close(fl)
}

As you can see from the debug prints, on Windows 7 everything works as 
planned, while on ubuntu 12.04 seek stops seeking when it reaches 17Gb. 
From then on the file size grows 1 byte at a time, which is explained by 
the writeBin instruction. (The different, slightly larger, size is 
irrelevant; the code was run several times, all with the same result: at 
17179869176 bytes it no longer works.)
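
As a side note on the arithmetic only: the offset where seeking stops is exactly eight chunks of .Machine$integer.max, which puts it 8 bytes short of 2^34 (16 GiB). The sketch below just checks those numbers; it does not identify the cause of the behavior, and the variable names are illustrative.

```r
# Chunk size used by createBig: 2147483647, i.e. 2^31 - 1.
chunk <- .Machine$integer.max

# Offset reported after the last successful seek in the ubuntu run.
stall <- 8 * as.double(chunk)

# Distance from that offset to the 2^34 byte (16 GiB) boundary.
gap_to_2_34 <- 2^34 - stall
```

Here stall evaluates to 17179869176 and gap_to_2_34 to 8, so the next chunk-sized seek would have been the first to cross 2^34 bytes.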


#
#
# System: Windows 7 / R 2.15.1

size <- 10*.Machine$integer.max + sample(.Machine$integer.max, 1)
size
[1] 22195364413

createBig("Test.txt", size)
[1] 2147483647
[1] 4294967294
[1] 6442450941
[1] 8589934588
[1] 10737418235
[1] 12884901882
[1] 15032385529
[1] 17179869176
[1] 19327352823
[1] 21474836470

file.info("Test.txt")$size
[1] 22195364413
file.info("Test.txt")$size %/% .Machine$integer.max
[1] 10
file.info("Test.txt")$size %% .Machine$integer.max
[1] 720527943

sessionInfo()
R version 2.15.1 (2012-06-22)
Platform: i386-pc-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=Portuguese_Portugal.1252 LC_CTYPE=Portuguese_Portugal.1252
[3] LC_MONETARY=Portuguese_Portugal.1252 LC_NUMERIC=C
[5] LC_TIME=Portuguese_Portugal.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods base

loaded via a namespace (and not attached):
[1] fortunes_1.5-0

#
#
# System: ubuntu 12.04 Precise Pangolin / R 2.15.1
size <- 10*.Machine$integer.max + sample(.Machine$integer.max, 1)
size
[1] 23091487381

createBig("Test.txt", size)
[1] 2147483647
[1] 4294967294
[1] 6442450941
[1] 8589934588
[1] 10737418235
[1] 12884901882
[1] 15032385529
[1] 17179869176
[1] 17179869177
[1] 17179869178

file.info("Test.txt")$size
[1] 17179869179
file.info("Test.txt")$size %/% .Machine$integer.max
[1] 8
file.info("Test.txt")$size %% .Machine$integer.max
[1] 3
[1] 3


sessionInfo()
R version 2.15.1 (2012-06-22)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=pt_PT.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=pt_PT.UTF-8        LC_COLLATE=pt_PT.UTF-8
 [5] LC_MONETARY=pt_PT.UTF-8    LC_MESSAGES=pt_PT.UTF-8
 [7] LC_PAPER=C                 LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] 

Re: [R] changing outlier shapes of boxplots using lattice

2012-09-28 Thread ilai
On Fri, Sep 28, 2012 at 6:57 AM, Richard M. Heiberger r...@temple.edu wrote:

 Elaine,

 For panel.bwplot you see that the central dot and the outlier dots are
 controlled by
 the same pch argument.


??? I don't think so...

bwplot(rgamma(20,.1,1)~gl(2,10), pch=rep(17,2),
panel = lattice::panel.bwplot)

I think you mean panel.bwplot.intermediate.hh ?

BTW thank you for the useful HH package but in this case OP is using it
with no "at" argument, so why not

Diet.colors <- c("forestgreen", "darkgreen", "chocolate1", "darkorange2",
"sienna2", "red2", "firebrick3", "saddlebrown", "coral4", "chocolate4",
"darkblue", "navy", "grey38")
 bwplot(rgamma(20*13,1,.1)~gl(13,20),
  fill = Diet.colors, pch = "|",
  par.settings = list(box.umbrella=list(lty=1)))

cheers



I initially set the pch="|" to match your first
 example with the horizontal
 indicator for the median.  I would be inclined to use the default circle
 for the outliers and
 therefore also for the median.

 Rich

 On Fri, Sep 28, 2012 at 7:13 AM, Sarah Goslee sarah.gos...@gmail.com
 wrote:

  I would guess that if you find the bit that says pch="|" and change it to
  pch=1 it will solve your question, and that reading ?par will tell you
 why.
 
  Sarah
 
  On Thursday, September 27, 2012, Elaine Kuo wrote:
 
   Hello
  
   This is Elaine.
  
   I am using package lattice to generate boxplots.
   Using Richard's code, the display was almost perfect except the outlier
   shape.
   Based on the following code, the outliers are vertical lines.
   However, I want the outliers to be empty circles.
   Please kindly help how to modify the code to change the outlier shapes.
   Thank you.
  
   code
   package (lattice)
  
   dataN <- data.frame(GE_distance=rnorm(260),
                       Diet_B=factor(rep(1:13, each=20)))

   Diet.colors <- c("forestgreen", "darkgreen", "chocolate1", "darkorange2",
                    "sienna2", "red2", "firebrick3", "saddlebrown", "coral4",
                    "chocolate4", "darkblue", "navy", "grey38")

   levels(dataN$Diet_B) <- Diet.colors

   bwplot(GE_distance ~ Diet_B, data=dataN,
          xlab=list("Diet of Breeding Ground", cex = 1.4),
          ylab = list(
            "Distance between Centers of B and NB Range (1000 km)",
            cex = 1.4),
          panel=panel.bwplot.intermediate.hh,
          col=Diet.colors,
          pch=rep("|",13),
          scales=list(x=list(rot=90)),
          par.settings=list(box.umbrella=list(lty=1)))
  
  
 
 
  --
  Sarah Goslee
  http://www.stringpage.com
  http://www.sarahgoslee.com
  http://www.functionaldiversity.org
 
 





[R] Better way of Grouping?

2012-09-28 Thread Charles Determan Jr
Hello R users,

This is more of a convenience question that I hope others might find useful
if there is a better answer.  I work with large datasets that requires
multiple parsing stages for different analysis.  For example, compare group
3 vs. group 4.  A more complicated comparison would be time B in group 3 of
group L with B in group 4 of group L.  I normally subset each group with
the following type of code.

data=read(...)

#L v D
L=data[LvD %in% c("L"),]
D=data[LvD %in% c("D"),]

#Groups 3 and 4 within L and D
group3L=L[group %in% c(3),]
group4L=L[group %in% c(3),]

group3D=D[group %in% c(3),]
group4D=D[group %in% c(3),]

#Times B, S45, FR2, FR8
you get the idea


Is there a more efficient way to subset groups?  Thanks for any insight.

Regards,
Charles



Re: [R] [R-sig-hpc] Quickest way to make a large empty file on disk?

2012-09-28 Thread Simon Urbanek

On Sep 28, 2012, at 12:44 PM, Jonathan Greenberg wrote:

 Rui:
 
 Quick follow-up -- it looks like seek does do what I want (I see Simon
 suggested it some time ago) -- what do you mean by "trash your disk"?  

I can't speak for Rui, but the difference between seeking and an explicit write is 
that the FS can optimize the former by not actually writing anything to disk 
(which is why it's so fast on some OS/FS combos). However, what this means is that 
the layout on the disk may not be sequential depending on the write patterns of 
the actual data blocks, because the FS may keep a mask of unused blocks and 
not write them. But that is just a FS issue and thus varies vastly by OS and 
FS. For your use this probably doesn't matter as you probably don't need to 
stream the resulting file at the end.
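
To make the seek approach concrete, here is a small sketch that pre-creates a 1 MiB file by seeking to the last byte and writing a single null. The path and size are illustrative assumptions; whether the file system stores the result sparsely depends on the OS and FS, as described above, and this sketch does not check that.

```r
# Illustrative target: a temporary file of 1 MiB.
path <- tempfile(fileext = ".bin")
n_bytes <- 1024 * 1024

con <- file(path, open = "wb")
# Move the write position to the last byte of the desired file ...
seek(con, where = n_bytes - 1, origin = "start", rw = "write")
# ... and write one null byte there, so the file ends at n_bytes.
writeBin(raw(1), con)
close(con)

# Logical size as reported by the file system.
created_size <- file.info(path)$size
```

Only one byte is written explicitly; everything before it is whatever the file system supplies for the skipped range, which is why this is much cheaper than writing zeros for the whole file.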


 What I'm
 trying to accomplish is getting parallel, asynchronous writes to a large
 binary image (just a binary file) working.  Each node writes to a different
 sector of the file via mmap, filling in the values as the process runs,
 but the file needs to be pre-created before I can mmap it.  Running a
 writeBin with a bunch of 0s would mean I'd basically have to write the file
 twice, but the seek/ff trick seems to be much faster.
 
 Do I risk doing some damage to my filesystem if I use seek?  I see there is
 a strongly worded warning in the help for ?seek:
 
 "Use of seek on Windows is discouraged. We have found so many errors in the
 Windows implementation of file positioning that users are advised to use it
 only at their own risk, and asked not to waste the *R* developers' time
 with bug reports on Windows' deficiencies." -- there's no detail here on
 which errors people have experienced, so I'm not sure if doing something as
 simple as just creating a file using seek falls under the discouraging
 category.
 

Quick search in my mail shows issues that were related to what Windows reports 
as the seek location on text files when querying. AFAICS it did not affect the 
side-effect of seek which is what you're interested in.

Cheers,
Simon


 As a note, we are trying to work this up on both Windows and *nix systems,
 hence our wanting to have a single approach that works on both OSs.
 
 --j
 
 
 On Thu, Sep 27, 2012 at 3:49 PM, Rui Barradas ruipbarra...@sapo.pt wrote:
 
 Hello,
 
 If you really need to trash your disk, why not use seek()?
 
 fl <- file("Test.txt", open = "wb")
 seek(fl, where = 1024, origin = "start", rw = "write")
 [1] 0
 writeChar(character(1), fl, nchars = 1, useBytes = TRUE)
 Warning message:
 In writeChar(character(1), fl, nchars = 1, useBytes = TRUE) :
  writeChar: more characters requested than are in the string - will
 zero-pad
 close(fl)
 
 
 File Test.txt is now 1Kb in size.
 
 Hope this helps,
 
 Rui Barradas
 On 27-09-2012 20:17, Jonathan Greenberg wrote:
 
 Folks:
 
 Asked this question some time ago, and found what appeared (at first) to be
 the best solution, but I'm now finding a new problem.  First off, it seemed
 like ff as Jens suggested worked:
 
 # outdata_ncells = the number of rows * number of columns * number of bands
 # in an image:
 out <- ff(vmode="double", length=outdata_ncells, filename=filename)
 finalizer(out) <- "close"
 close(out)
 
 This was working fine until I attempted to set length to a VERY large
 number: outdata_ncells = 17711913600.  This would create a file that is
 131.964GB.  Big, but not obscenely so (and certainly not larger than the
 filesystem can handle).  However, length appears to be restricted
 by .Machine$integer.max (I'm on a 64-bit windows box):
 
 .Machine$integer.max
 
 [1] 2147483647
 
 Any suggestions on how to solve this problem for much larger file sizes?
 
 --j
 
 
 On Thu, May 3, 2012 at 10:44 AM, Jonathan Greenberg j...@illinois.edu wrote:
 
 
 Thanks, all!  I'll try these out.  I'm trying to work up something that is
 platform independent (if possible) for use with mmap.  I'll do some tests
 on these suggestions and see which works best. I'll try to report back in a
 few days.  Cheers!
 
 --j
 
 
 
 2012/5/3 Jens Oehlschlägel jens.oehlschlae...@truecluster.com
 
 Jonathan,
 
 On some filesystems (e.g. NTFS, see below) it is possible to create
 'sparse' memory-mapped files, i.e. reserving the space without the cost of
 actually writing initial values.
 Package 'ff' does this automatically and also allows to access the file
 in parallel. Check the example below and see how big file creation is
 immediate.
 
 Jens Oehlschlägel
 
 
 
 library(ff)
 library(snowfall)
 ncpus <- 2
 n <- 1e8
 system.time(
 + x <- ff(vmode="double", length=n, filename="c:/Temp/x.ff")
 + )
    User  System elapsed
    0.01    0.00    0.02
 
 # check finalizer, with an explicit filename we should have a 'close' finalizer
 finalizer(x)
 [1] "close"
 
 # if not, set it to 'close' in order to not let slaves delete x on slave shutdown
 finalizer(x) <- "close"
 sfInit(parallel=TRUE, cpus=ncpus, type="SOCK")
 
 R 

[R] Select Original and Duplicates

2012-09-28 Thread Adam Gabbert
I would like to select all the duplicate rows of a data frame including
the original.  Any help would be much appreciated.  This is where I'm at so
far. Thanks.

#Sample data frame:
df <- read.table(header=T, con <- textConnection('
 label value
 A 4
 B 3
 C 6
 B 3
 B 1
 A 2
 A 4
 A 4
'))
close(con)

# Duplicate entries
df[duplicated(df),]

# label value
# B 3
# A 4
# A 4

#I want to select all the rows that are duplicated including the original
#This is the output I want
# label value
# B 3
# B 3
# A 4
# A 4
# A 4



Re: [R] Select Original and Duplicates

2012-09-28 Thread Rui Barradas

Hello,

Try the following.


idx <- duplicated(df) | duplicated(df, fromLast = TRUE)
df[idx, ]

Note that they are returned in their original order in the df.
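
A self-contained run of the two lines above on the sample data from the question may help. The only change in this sketch is that the data are read through the text= argument of read.table rather than a textConnection; the name dups is illustrative.

```r
# Sample data from the question.
df <- read.table(header = TRUE, text = "
 label value
 A 4
 B 3
 C 6
 B 3
 B 1
 A 2
 A 4
 A 4
")

# duplicated() flags second and later copies; scanning fromLast = TRUE
# flags first and earlier copies. Their union marks every member of a
# duplicated set, originals included.
idx <- duplicated(df) | duplicated(df, fromLast = TRUE)
dups <- df[idx, ]
```

dups keeps rows 1, 2, 4, 7 and 8 in their original order: one A 4, two B 3, then two more A 4.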

Hope this helps,

Rui Barradas

On 28-09-2012 21:11, Adam Gabbert wrote:

I would like to select all the duplicate rows of a data frame including
the original.  Any help would be much appreciated.  This is where I'm at so
far. Thanks.

#Sample data frame:
df <- read.table(header=T, con <- textConnection('
  label value
  A 4
  B 3
  C 6
  B 3
  B 1
  A 2
  A 4
  A 4
'))
close(con)

# Duplicate entries
df[duplicated(df),]

# label value
# B 3
# A 4
# A 4

#I want to select all the rows that are duplicated including the original
#This is the output I want
# label value
# B 3
# B 3
# A 4
# A 4
# A 4



Re: [R] Better way of Grouping?

2012-09-28 Thread Jeff Newmiller
You have not specified the objective function you are trying to optimize with 
your term efficient, or what you do with all of these subsets once you have 
them. 

For notational simplification and completeness of coverage (not necessarily 
computational speedup) you might want to look at tapply or ddply/dlply from 
the plyr package. If you build lists of subsets you can index into them 
according to grouping value. You can use expand.grid to build all combinations 
of grouping values to use as indexes into those lists of subsets.

To reiterate, you have not indicated what you want to do with these subsets, so 
there could be special-purpose functions that do what you want.  As always, 
reproducible code leads to reproducible answers. :)
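
One way the suggestion above could look in base R, as a sketch only. The data frame dat and its columns LvD, group and value are invented for illustration, since the original data were not posted.

```r
set.seed(1)
# Made-up data: two LvD levels, two groups, one measurement.
dat <- data.frame(
  LvD   = rep(c("L", "D"), each = 6),
  group = rep(c(3, 4), times = 6),
  value = rnorm(12)
)

# One list element per LvD/group combination, named like "L.3".
subsets <- split(dat, list(dat$LvD, dat$group))

# All combinations of grouping values, with a key that indexes the list.
combos <- expand.grid(LvD = c("L", "D"), group = c(3, 4),
                      stringsAsFactors = FALSE)
combos$key <- paste(combos$LvD, combos$group, sep = ".")

# Pull out a single subset by its key instead of a separate variable.
group3L <- subsets[["L.3"]]

# Per-cell summary computed directly, without materialising subsets.
cell_means <- tapply(dat$value, list(dat$LvD, dat$group), mean)
```

Whether this is "more efficient" depends on what is done with the subsets afterwards, which is exactly the open question raised above.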
---
Jeff NewmillerThe .   .  Go Live...
DCN:jdnew...@dcn.davis.ca.usBasics: ##.#.   ##.#.  Live Go...
  Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
/Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
--- 
Sent from my phone. Please excuse my brevity.

Charles Determan Jr deter...@umn.edu wrote:

Hello R users,

This is more of a convenience question that I hope others might find
useful
if there is a better answer.  I work with large datasets that requires
multiple parsing stages for different analysis.  For example, compare
group
3 vs. group 4.  A more complicated comparison would be time B in group
3 of
group L with B in group 4 of group L.  I normally subset each group
with
the following type of code.

data=read(...)

#L v D
L=data[LvD %in% c("L"),]
D=data[LvD %in% c("D"),]

#Groups 3 and 4 within L and D
group3L=L[group %in% c(3),]
group4L=L[group %in% c(3),]

group3D=D[group %in% c(3),]
group4D=D[group %in% c(3),]

#Times B, S45, FR2, FR8
you get the idea


Is there a more efficient way to subset groups?  Thanks for any
insight.

Regards,
Charles



[R] Merging multiple columns into one column

2012-09-28 Thread Meredith Ballard LaBeau
Good Evening-
 I have a dataframe that has 10 columns that has a header and 7306 rows in
each column, I want to combine these columns into one. I utilized the stack
function but it only returned 3/4 of the data...my code is:
where nfcuy_bw is the dataframe with 7305 obs. and 10 variables
Once I apply this code I only receive a data frame with 58440 obs. of 2
variables, of which there should be 73,050 obs. of 2 variables, just
wondering what is happening here?

 View(nfcuy_bw)

attach(nfcuy_bw)

cuyahoga_nf <- data.frame(s5,s10,s25,s27,s33,s41,s51,his_c)

cuy_nf <- stack(cuyahoga_nf)

Thanks
Meredith

-- 
Doctoral Candidate
Department of Civil and Environmental Engineering
Michigan Technological University



Re: [R] Better way of Grouping?

2012-09-28 Thread David Winsemius

On Sep 28, 2012, at 11:59 AM, Charles Determan Jr wrote:

 Hello R users,
 
 This is more of a convenience question that I hope others might find useful
 if there is a better answer.  I work with large datasets that requires
 multiple parsing stages for different analysis.  For example, compare group
 3 vs. group 4.  A more complicated comparison would be time B in group 3 of
 group L with B in group 4 of group L.  I normally subset each group with
 the following type of code.
 
 data=read(...)
 
 #L v D
 L=data[LvD %in% c("L"),]
 D=data[LvD %in% c("D"),]
 
 #Groups 3 and 4 within L and D
 group3L=L[group %in% c(3),]
 group4L=L[group %in% c(3),]

Assume you meant to have a 4 there
 
 group3D=D[group %in% c(3),]
 group4D=D[group %in% c(3),]

Ditto. Only makes sense with a 4.



The usual way is to use:

lapply( split(data, interaction(data$LvD, data$group)) ,
 function(subdf) { do something with subdf } )

That way you do not end up littering your workspace with subsidiary subsets of 
your main data object.
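
A runnable version of that pattern, as a sketch under stated assumptions: the data frame dat, its columns, and the choice of a t-test for the group 3 vs. group 4 comparison are all invented here, because the thread does not say what is done with each subset.

```r
set.seed(42)
# Made-up data with the two grouping columns used in the thread.
dat <- data.frame(
  LvD   = rep(c("L", "D"), each = 20),
  group = rep(rep(c(3, 4), each = 10), times = 2),
  value = rnorm(40)
)

# Split once; list names look like "L.3", "D.4", and so on.
by_cell <- split(dat, interaction(dat$LvD, dat$group))

# Apply the same function to every subset.
summaries <- lapply(by_cell, function(subdf) {
  c(n = nrow(subdf), mean = mean(subdf$value))
})

# Example comparison: group 3 vs. group 4 within L.
cmp <- t.test(by_cell[["L.3"]]$value, by_cell[["L.4"]]$value)
```

The subsets live in one list, so adding a further grouping variable such as time means adding it to the interaction() call rather than creating more variables.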


 
 #Times B, S45, FR2, FR8
 you get the idea
 
 
 Is there a more efficient way to subset groups?  Thanks for any insight.
 
-- 

David Winsemius, MD
Alameda, CA, USA



Re: [R] Merging multiple columns into one column

2012-09-28 Thread David Winsemius

On Sep 28, 2012, at 2:51 PM, Meredith Ballard LaBeau wrote:

 Good Evening-
 I have a dataframe that has 10 columns that has a header and 7306 rows in
 each column, I want to combine these columns into one. I utilized the stack
 function but it only returned 3/4 of the data...my code is:
 where nfcuy_bw is the dataframe with 7305 obs. and 10 variables
 Once I apply this code I only receive a data frame with 58440 obs. of 2
 variables, of which there should be 73,050 obs. of 2 variables, just
 wondering what is happening here?
 
 View(nfcuy_bw)
 
 attach(nfcuy_bw)

Using 'attach' is a great way to produce confusing errors.

 
 cuyahoga_nf <- data.frame(s5,s10,s25,s27,s33,s41,s51,his_c)
 
 cuy_nf <- stack(cuyahoga_nf)

Unable to do much else in the absence of a dataset, much less a summary of 
these objects,  whose creation is your responsibility, not ours.

-- 

David Winsemius, MD
Alameda, CA, USA



[R] Heatmap Colors

2012-09-28 Thread Nick Fankhauser

Hello R-Users!

I'm using a heatmap to visualize a matrix of values between -1 and 3.
How can I set the colors so that white is zero, below zero is blue of 
increasing intensity towards -1 and above zero is red of increasing 
intensity towards 3?


I tried like this (using the marray and gplots packages from bioconductor):
mcol <- maPalette(low="blue", mid="white", high="red", k=100)
heatmap.2(my_matrix, col=mcol)

But white does not correspond to zero, because the value distribution is 
not symmetrical, so that zero is not in the middle.
Is it somehow possible to create a color palette with white centered at 
zero?


Nick Fankhauser



Re: [R] Merging multiple columns into one column

2012-09-28 Thread Bert Gunter
?unlist

(A data frame is a list, as ?data.frame explains. Also the Intro to R
tutorial, which should be read by everyone beginning with R).
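
A small sketch of both unlist() and stack() on a toy data frame. The toy column names are borrowed from the question but the values are invented. The last two lines only check the arithmetic of the row counts reported in the question: 58440 is 8 x 7305, and the data.frame() call there lists eight columns, so one possible reading is that stack() returned everything it was given and the other two columns were never passed in. That cannot be confirmed without the data.

```r
# Toy data frame standing in for the real one.
wide <- data.frame(s5 = 1:3, s10 = 4:6, s25 = 7:9)

# unlist() concatenates the columns into one plain vector.
long_vec <- unlist(wide, use.names = FALSE)

# stack() returns two columns: values, and ind (the source column name).
long_df <- stack(wide)

# Row counts from the question, divided by the stated 7305 observations.
cols_received <- 58440 / 7305
cols_expected <- 73050 / 7305
```

For the toy data, nrow(long_df) is nrow(wide) * ncol(wide), that is 9; cols_received is 8 and cols_expected is 10.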

-- Bert

On Fri, Sep 28, 2012 at 2:51 PM, Meredith Ballard LaBeau
mmbal...@mtu.edu wrote:
 Good Evening-
  I have a dataframe that has 10 columns that has a header and 7306 rows in
 each column, I want to combine these columns into one. I utilized the stack
 function but it only returned 3/4 of the data...my code is:
 where nfcuy_bw is the dataframe with 7305 obs. and 10 variables
 Once I apply this code I only receive a data frame with 58440 obs. of 2
 variables, of which there should be 73,050 obs. of 2 variables, just
 wondering what is happening here?

  View(nfcuy_bw)

 attach(nfcuy_bw)

 cuyahoga_nf <- data.frame(s5,s10,s25,s27,s33,s41,s51,his_c)

 cuy_nf <- stack(cuyahoga_nf)

 Thanks
 Meredith

 --
 Doctoral Candidate
 Department of Civil and Environmental Engineering
 Michigan Technological University




-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm



Re: [R] changing outlier shapes of boxplots using lattice

2012-09-28 Thread Elaine Kuo
Hello Ilai,

Thank you for the response.
It did help a lot.

However, a beginner to lattice has three questions.

Q1

Please kindly explain why "in this case OP is using it with no at
argument",
so it is possible to display the median and the outliers with different pch?

Q2.
what is the relationship between package HH and graphic-drawing?
I checked ??HH and found little explanation on its function of
graphic-drawing.

Q3

Please kindly advise how to make outliers empty circle (pch=2) in this case
as the code below.
Thank you.

code

Diet.colors <-
c("forestgreen","darkgreen","chocolate1","darkorange2","sienna2",
"red2","firebrick3","saddlebrown","coral4","chocolate4","darkblue","navy","grey38")

levels(dataN$Diet_B) <- diet.code


bwplot(MS_midpoint_lat~Diet_B, data=dataN,
xlab=list("Diet of Breeding Ground", cex = 1.4),
ylab = list("Latitudinal Midpoint Breeding Ground", cex = 1.4),
lwd=1.5,
cex.lab=1.4, cex.axis=1.2,
font.axis=2,
cex=1.5,
las=1,
panel=panel.bwplot.intermediate.hh,
bty="l",
col=Diet.colors,
pch=rep("l",13),
scales=list(x=list(rot=90)),
par.settings=list(plot.symbol = list(pch = 2, cex = 2),
box.umbrella=list(lty=1)))
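
A possible sketch for Q3, using lattice's default panel function rather than panel.bwplot.intermediate.hh (whose handling of pch is not checked here). With the default panel, the pch argument sets the median marker, while outliers are drawn with the plot.symbol settings. Note that in R's symbol numbering the open circle is pch = 1; pch = 2 is an open triangle (see ?points). The data below are simulated for illustration.

```r
library(lattice)

set.seed(1)
# Simulated, skewed data so that some outliers appear.
dataN <- data.frame(GE_distance = rgamma(260, 1, 0.1),
                    Diet_B = gl(13, 20))

p <- bwplot(GE_distance ~ Diet_B, data = dataN,
            pch = "|",                          # median marker
            scales = list(x = list(rot = 90)),
            par.settings = list(
              plot.symbol  = list(pch = 1, cex = 1),  # outliers: open circles
              box.umbrella = list(lty = 1)))
```

Calling print(p) draws the plot; the object is kept unprinted here so the settings can be inspected first.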

Elaine


On Sat, Sep 29, 2012 at 2:44 AM, ilai ke...@math.montana.edu wrote:

 On Fri, Sep 28, 2012 at 6:57 AM, Richard M. Heiberger r...@temple.edu wrote:

 Elaine,

 For panel.bwplot you see that the central dot and the outlier dots are
 controlled by
 the same pch argument.


 ??? I don't think so...

 bwplot(rgamma(20,.1,1)~gl(2,10), pch=rep(17,2),
 panel = lattice::panel.bwplot)

 I think you mean panel.bwplot.intermediate.hh ?

 BTW thank you for the useful HH package but in this case OP is using it
 with no "at" argument, so why not

 Diet.colors <- c("forestgreen", "darkgreen", "chocolate1", "darkorange2",
 "sienna2", "red2", "firebrick3", "saddlebrown", "coral4", "chocolate4",
 "darkblue", "navy", "grey38")
  bwplot(rgamma(20*13,1,.1)~gl(13,20),
   fill = Diet.colors, pch = "|",
   par.settings = list(box.umbrella=list(lty=1)))

 cheers



 I initially set the pch="|" to match your first
 example with the horizontal
 indicator for the median.  I would be inclined to use the default circle
 for the outliers and
 therefore also for the median.

 Rich

 On Fri, Sep 28, 2012 at 7:13 AM, Sarah Goslee sarah.gos...@gmail.com
 wrote:

  I would guess that if you find the bit that says pch="|" and change it to
  pch=1 it will solve your question, and that reading ?par will tell you
 why.
 
  Sarah
 
  On Thursday, September 27, 2012, Elaine Kuo wrote:
 
   Hello
  
   This is Elaine.
  
   I am using package lattice to generate boxplots.
   Using Richard's code, the display was almost perfect except the
 outlier
   shape.
   Based on the following code, the outliers are vertical lines.
   However, I want the outliers to be empty circles.
   Please kindly help how to modify the code to change the outlier
 shapes.
   Thank you.
  
   code
   package (lattice)
  
   dataN <- data.frame(GE_distance=rnorm(260),
                       Diet_B=factor(rep(1:13, each=20)))

   Diet.colors <- c("forestgreen", "darkgreen", "chocolate1", "darkorange2",
                    "sienna2", "red2", "firebrick3", "saddlebrown", "coral4",
                    "chocolate4", "darkblue", "navy", "grey38")

   levels(dataN$Diet_B) <- Diet.colors

   bwplot(GE_distance ~ Diet_B, data=dataN,
          xlab=list("Diet of Breeding Ground", cex = 1.4),
          ylab = list(
            "Distance between Centers of B and NB Range (1000 km)",
            cex = 1.4),
          panel=panel.bwplot.intermediate.hh,
          col=Diet.colors,
          pch=rep("|",13),
          scales=list(x=list(rot=90)),
          par.settings=list(box.umbrella=list(lty=1)))
  
  
 
 
  --
  Sarah Goslee
  http://www.stringpage.com
  http://www.sarahgoslee.com
  http://www.functionaldiversity.org
 
 






Re: [R] Heatmap Colors

2012-09-28 Thread David Winsemius

On Sep 28, 2012, at 3:16 PM, Nick Fankhauser wrote:

 Hello R-Users!
 
 I'm using a heatmap to visualize a matrix of values between -1 and 3.
 How can I set the colors so that white is zero, below zero is blue of 
 increasing intensity towards -1 and above zero is red of increasing intensity 
 towards 3?
 
 I tried like this (using the marray and gplots packages from bioconductor):
 mcol <- maPalette(low="blue", mid="white", high="red", k=100)
 heatmap.2(my_matrix, col=mcol)
 
 But white does not correspond to zero, because the value distribution is not 
 symmetrical, so that zero is not in the middle.
 Is it somehow possible to create a color palette with white centered at zero?

The way you stated it at the beginning, I thought you should want the palette 
centered at 1 rather than 0:

test <- seq(-1, 3, len=20)
shift.BR <- colorRamp(c("blue", "white", "red"), bias=2)((1:16)/16)
tpal <- rgb(shift.BR, maxColorValue=255)
barplot(test, col = tpal)

Perhaps I was being led astray by a somewhat similar question on StackOverflow.

-- 
David Winsemius, MD
Alameda, CA, USA



Re: [R] Heatmap Colors

2012-09-28 Thread David Winsemius

On Sep 28, 2012, at 4:52 PM, David Winsemius wrote:

 
 On Sep 28, 2012, at 3:16 PM, Nick Fankhauser wrote:
 
 Hello R-Users!
 
 I'm using a heatmap to visualize a matrix of values between -1 and 3.
 How can I set the colors so that white is zero, below zero is blue of 
 increasing intensity towards -1 and above zero is red of increasing 
 intensity towards 3?
 
 I tried like this (using the marray and gplots packages from bioconductor):
 mcol <- maPalette(low="blue", mid="white", high="red", k=100)
 heatmap.2(my_matrix, col=mcol)
 
 But white does not correspond to zero, because the value distribution is not 
 symmetrical, so that zero is not in the middle.
 Is it somehow possible to create a color palette with white centered at zero?
 
 The way you stated it at the beginning, I thought you should want the palette 
 centered at 1 rather than 0:

Oops ... the number of breaks should match the number of colors:

test <- seq(-1, 3, len=20)
shift.BR <- colorRamp(c("blue", "white", "red"), bias=2)((1:20)/20)
tpal <- rgb(shift.BR, maxColorValue=255)
barplot(test, col = tpal)
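
For the original request (white exactly at zero on a -1 to 3 scale), one possible approach is to split the color breaks at zero and build the blue and red halves separately. The sketch below uses base graphics only; my_matrix is simulated here, and n_neg and n_pos are illustrative names, not anything from the thread. heatmap.2 in gplots also takes col and breaks arguments, so the same two vectors should carry over, although that is not tested here.

```r
# Simulated stand-in for the poster's matrix: values between -1 and 3
set.seed(42)
my_matrix <- matrix(runif(100, min = -1, max = 3), nrow = 10)

n_neg <- 25   # number of blue shades covering [-1, 0]
n_pos <- 75   # number of red shades covering [0, 3]

# Breakpoints: n_neg bins below zero, n_pos bins above, with a break at 0
brks <- c(seq(-1, 0, length.out = n_neg + 1),
          seq(0, 3, length.out = n_pos + 1)[-1])

# Blue fades to white towards zero, white deepens to red towards 3
cols <- c(colorRampPalette(c("blue", "white"))(n_neg),
          colorRampPalette(c("white", "red"))(n_pos))

# image() needs exactly one more break than colors
stopifnot(length(cols) == length(brks) - 1)

image(my_matrix, col = cols, breaks = brks)
```

Choosing n_neg and n_pos in proportion to the two ranges (1 and 3 here) keeps the bin width the same on both sides of zero.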
 
-- 

David Winsemius, MD
Alameda, CA, USA



[R] Errors in if statement

2012-09-28 Thread JiangZhengyu

Hi guys,

I have a geno matrix with many rows (1000) and columns (30). I use the
following loop and condition statement (adapted from someone else's code), and
I always get the error below. I was wondering if anyone knows what the problem
is and how to fix it.

Thanks,
Zhengyu

### geno matrix
P1  P2  P3  P4 
1  2  2  3 2 
 2  2  2  1 1
1  2  1  2  NANA 2  3  4  5
###
for(i in 1:4) {
 cat(i,)
 if(sum(geno[i,]!=2)3  sum(geno[i,]==1)=1  sum(geno[i,]==3)=1){
   tmp = 1
   }
}
###
1 2 Error in if (sum(geno[i, ] != 2)  3  sum(geno[i, ] == 1) = 1  sum(geno[i,  : 
  missing value where TRUE/FALSE needed
  


Re: [R] Better way of Grouping?

2012-09-28 Thread arun
Hi,
You can also use grepl() to subset:


LD <- paste0(rep(rep(c(3,4), each=4), 2), c(rep("L",8), rep("D",8)))
set.seed(1)
dat1 <- data.frame(LD=LD, value=sample(1:15, 16, replace=TRUE))
dat2 <- within(dat1, {LD <- as.character(LD)})
dat2[grepl(".*L", dat2$LD),] # subset all L values
dat2[grepl(".*D", dat2$LD),] # subset all D values
dat2[grepl("3D", dat2$LD),]
dat2[grepl("4D", dat2$LD),]
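
Another option, sketched here under the assumption that the grouping variables are ordinary columns named LvD and group (as in the question; the data below are made up), is to let split() build every subgroup at once instead of subsetting one at a time:

```r
# Made-up data using the column names from the question
set.seed(1)
dat <- data.frame(
  LvD   = rep(c("L", "D"), each = 8),
  group = rep(rep(c(3, 4), each = 4), times = 2),
  value = sample(1:15, 16, replace = TRUE)
)

# One data frame per LvD x group combination, kept together in a named list
by_cell <- split(dat, list(dat$LvD, dat$group), drop = TRUE)
names(by_cell)

# Pull out a single subgroup by name, e.g. group 3 within L
group3L <- by_cell[["L.3"]]
group3L

# Apply the same summary to every subgroup
sapply(by_cell, function(d) mean(d$value))
```

Adding a further grouping column (such as the time variable) to the list passed to split() extends this to the more detailed comparisons.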


A.K.




- Original Message -
From: Charles Determan Jr deter...@umn.edu
To: r-help@r-project.org
Cc: 
Sent: Friday, September 28, 2012 2:59 PM
Subject: [R] Better way of Grouping?

Hello R users,

This is more of a convenience question that I hope others might find useful
if there is a better answer.  I work with large datasets that requires
multiple parsing stages for different analysis.  For example, compare group
3 vs. group 4.  A more complicated comparison would be time B in group 3 of
group L with B in group 4 of group L.  I normally subset each group with
the following type of code.

data=read(...)

#L v D
L=data[LvD %in% c("L"),]
D=data[LvD %in% c("D"),]

#Groups 3 and 4 within L and D
group3L=L[group %in% c(3),]
group4L=L[group %in% c(4),]

group3D=D[group %in% c(3),]
group4D=D[group %in% c(4),]

#Times B, S45, FR2, FR8
you get the idea


Is there a more efficient way to subset groups?  Thanks for any insight.

Regards,
Charles



[R] Text mining? Text manipulation? Both? Predicting KRAS test results in cancer patients

2012-09-28 Thread Paul Miller
Happy Friday Everyone,
 
Hope Friday afternoon doesn't turn out to be a terrible time to post a 
question. I've been doing a little data mining of patient text medical records 
as of late. I started out trying to predict whether or not cancer patients had 
received KRAS mutation testing and did quite well with that. Now I'm trying to 
predict the results of KRAS testing (mutated vs. wild type). This is proving to 
be a little more difficult.
 
With the first classification task, I created counts of terms (e.g., "kras", 
"mutated") in the text medical records using the tm package and then used those 
counts to predict whether or not patients had had KRAS mutation testing. I 
tried a few different analyses here, but found that random forests worked the 
best.
 
Predicting the results of testing is harder though because of the way 
physicians and other healthcare professionals write about testing. For example, 
I'm finding phrases like "KRAS mutation returned wild-type". In this example, 
if we're counting, we get 1 instance of "kras", 1 instance of "mutated", and 
1 instance of "wild". So you can see how it might be difficult to accurately 
predict the results of testing based on counts alone.
 
My question is how best to deal with this. Are there any R text mining packages 
or related software that would be particularly suited to my problem? I took a 
look at the CRAN Task View: Natural Language Processing and there were so many 
options I didn't really know where to start (and it's not even clear that an 
R-based solution will work best for my problem). Alternatively, is there any 
real chance one could simply write code that would be able to identify true 
references to the results of KRAS testing and then create counts only of what 
are likely to be true references?
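
As a rough illustration of that last idea, a rule-based pass with regular expressions might look something like the sketch below. The example notes are invented, and the two patterns are assumptions about how results tend to be phrased; real records would need many more variants and a check against manually labeled notes.

```r
# Invented snippets of clinical text (not real patient data)
notes <- c(
  "KRAS mutation returned wild-type.",
  "KRAS testing showed a mutation in codon 12.",
  "Patient has not yet had KRAS testing.",
  "Tumor is KRAS wild type."
)

txt <- tolower(notes)

# Assumed patterns: "kras" followed, within the same sentence, by a result phrase
wild_pat    <- "kras[^.]*\\bwild[- ]?type\\b"
mutated_pat <- "kras[^.]*\\b(mutated|mutation (detected|found|in|positive))"

is_wild    <- grepl(wild_pat, txt, perl = TRUE)
# A wild-type statement takes precedence over a bare mention of "mutation"
is_mutated <- grepl(mutated_pat, txt, perl = TRUE) & !is_wild

result <- ifelse(is_wild, "wild type",
                 ifelse(is_mutated, "mutated", "unknown"))
data.frame(note = notes, result = result)
```

The resulting labels could then be used as features in place of, or alongside, the raw term counts.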
 
I'd greatly appreciate it if someone could point me in the right direction.
 
Thanks,
 
Paul 
 
 



Re: [R] Select Original and Duplicates

2012-09-28 Thread Adam Gabbert
That works. Thank you!

On Fri, Sep 28, 2012 at 4:22 PM, Rui Barradas ruipbarra...@sapo.pt wrote:

 Hello,

 Try the following.


 idx <- duplicated(df) | duplicated(df, fromLast = TRUE)
 df[idx, ]

 Note that they are returned in their original order in the df.
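
 A self-contained version of the above, run on the sample data from the
 question (read here through the text argument of read.table rather than a
 textConnection, which is an editorial substitution):

```r
df <- read.table(header = TRUE, text = "
label value
A 4
B 3
C 6
B 3
B 1
A 2
A 4
A 4
")

# duplicated() flags the second and later copies of a row;
# fromLast = TRUE flags the first and earlier copies instead.
# Their union marks every row that occurs more than once.
idx <- duplicated(df) | duplicated(df, fromLast = TRUE)
dups <- df[idx, ]
dups
```

 The selected rows keep their original row names, which makes it easy to trace
 them back to the full data frame.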

 Hope this helps,

 Rui Barradas

 Em 28-09-2012 21:11, Adam Gabbert escreveu:

 I would like to select all the duplicate rows of a data frame including
 the original.  Any help would be much appreciated.  This is where I'm at
 so far. Thanks.

 #Sample data frame:
 df <- read.table(header=T, con <- textConnection('
   label value
   A 4
   B 3
   C 6
   B 3
   B 1
   A 2
   A 4
   A 4
 '))
 close(con)

 # Duplicate entries
 df[duplicated(df),]

 # label value
 # B 3
 # A 4
 # A 4

 #I want to select all the rows that are duplicated including the original
 #This is the output I want
 # label value
 # B 3
 # B 3
 # A 4
 # A 4
 # A 4



Re: [R] Select Original and Duplicates

2012-09-28 Thread arun
HI,

You can also try:
idx <- data.frame(t(sapply(df, function(x) !is.na(match(x, x[duplicated(x)])))))
df1 <- df[sapply(idx, function(x) all(x == TRUE)), ]
df1
#  label value
#1 A 4
#2 B 3
#4 B 3
#7 A 4
#8 A 4

A.K.

- Original Message -
From: Rui Barradas ruipbarra...@sapo.pt
To: Adam Gabbert adamjgabb...@gmail.com
Cc: r-help@r-project.org
Sent: Friday, September 28, 2012 4:22 PM
Subject: Re: [R] Select Original and Duplicates

Hello,

Try the following.


idx <- duplicated(df) | duplicated(df, fromLast = TRUE)
df[idx, ]

Note that they are returned in their original order in the df.

Hope this helps,

Rui Barradas

Em 28-09-2012 21:11, Adam Gabbert escreveu:
 I would like to select all the duplicate rows of a data frame including
 the original.  Any help would be much appreciated.  This is where I'm at so
 far. Thanks.

 #Sample data frame:
 df <- read.table(header=T, con <- textConnection('
   label value
       A     4
       B     3
       C     6
       B     3
       B     1
       A     2
       A     4
       A     4
 '))
 close(con)

 # Duplicate entries
 df[duplicated(df),]

 # label value
 #     B     3
 #     A     4
 #     A     4

 #I want to select all the rows that are duplicated including the original
 #This is the output I want
 # label value
 #     B     3
 #     B     3
 #     A     4
 #     A     4
 #     A     4



[R] Converting array to matrix

2012-09-28 Thread farnoosh sheikhi
Hi,

I have a 3d array as below. I want to turn it into a matrix of p=50 (rows) 
and n=20 (columns) holding the coverage values.
The code before the array is:

library(binom)
Loading required package: lattice
pi.seq <- seq(from = 0.01, to = 0.5, by = 0.01)
no.seq <- seq(from = 5, to = 100, by = 5)
cp.all = binom.coverage(p = pi.seq, n = no.seq, conf.level = 0.95,
                        method = "exact")


I basically want to plot this probability with filled.contour.
Many thanks.

method    p   n  coverage
1     exact 0.01   5 0.9990199
2     exact 0.01  10 0.9957338
3     exact 0.01  15 0.9903702
4     exact 0.01  20 0.9831407
5     exact 0.01  25 0.9980493
6     exact 0.01  30 0.9966823
7     exact 0.01  35 0.9948463
8     exact 0.01  40 0.9925026
9     exact 0.01  45 0.9896219
10    exact 0.01  50 0.9861827
11    exact 0.01  55 0.9821712
12    exact 0.01  60 0.9775798
13    exact 0.01  65 0.9958308
14    exact 0.01  70 0.9945711
15    exact 0.01  75 0.9930800
16    exact 0.01  80 0.9913408
17    exact 0.01  85 0.9893386
18    exact 0.01  90 0.9870598
19    exact 0.01  95 0.9844924
20    exact 0.01 100 0.9816260
21    exact 0.02   5 0.9961576
22    exact 0.02  10 0.9838224
23    exact 0.02  15 0.9969606
24    exact 0.02  20 0.9929313
25    exact 0.02  25 0.9867566
26    exact 0.02  30 0.9782822
27    exact 0.02  35 0.9948918
28    exact 0.02  40 0.9917591
29    exact 0.02  45 0.9875780
30    exact 0.02  50 0.9822419
31    exact 0.02  55 0.9756698
32    exact 0.02  60 0.9929754
33    exact 0.02  65 0.9902072
34    exact 0.02  70 0.9867702
35    exact 0.02  75 0.9826010
36    exact 0.02  80 0.9776446
37    exact 0.02  85 0.9927058
38    exact 0.02  90 0.9904482
39    exact 0.02  95 0.9877327
40    exact 0.02 100 0.9845164
41    exact 0.03   5 0.9915279
42    exact 0.03  10 0.9972351
43    exact 0.03  15 0.9906286
44    exact 0.03  20 0.9789916
45    exact 0.03  25 0.9938142
46    exact 0.03  30 0.9880954
47    exact 0.03  35 0.9797802
48    exact 0.03  40 0.9933299
49    exact 0.03  45 0.9890462
50    exact 0.03  50 0.9831894
51    exact 0.03  55 0.9755598
52    exact 0.03  60 0.9908560
53    exact 0.03  65 0.9866943
54    exact 0.03  70 0.9813629
55    exact 0.03  75 0.9926775
56    exact 0.03  80 0.9896911
57    exact 0.03  85 0.9859049
58    exact 0.03  90 0.9812172
59    exact 0.03  95 0.9755343
60    exact 0.03 100 0.9893762
61    exact 0.04   5 0.9852420
62    exact 0.04  10 0.9937863
63    exact 0.04  15 0.9797082
64    exact 0.04  20 0.9925871
65    exact 0.04  25 0.9834784
66    exact 0.04  30 0.9936800
67    exact 0.04  35 0.9877867
68    exact 0.04  40 0.9789777
69    exact 0.04  45 0.9912599
70    exact 0.04  50 0.9855896
71    exact 0.04  55 0.9777638
72    exact 0.04  60 0.9901122
73    exact 0.04  65 0.9849824
74    exact 0.04  70 0.9781965
75    exact 0.04  75 0.9897956
76    exact 0.04  80 0.9852643
77    exact 0.04  85 0.9794261
78    exact 0.04  90 0.9899813
79    exact 0.04  95 0.9653302
80    exact 0.04 100 0.9641378
81    exact 0.05   5 0.9774075
82    exact 0.05  10 0.9884964
83    exact 0.05  15 0.9945327
84    exact 0.05  20 0.9840985
85    exact 0.05  25 0.9928351
86    exact 0.05  30 0.9843645
87    exact 0.05  35 0.9927483
88    exact 0.05  40 0.9861231
89    exact 0.05  45 0.9761385
90    exact 0.05  50 0.9882136
91    exact 0.05  55 0.9806825
92    exact 0.05  60 0.9902109
93    exact 0.05  65 0.9844774
94    exact 0.05  70 0.9766393
95    exact 0.05  75 0.9662306
96    exact 0.05  80 0.9650815
97    exact 0.05  85 0.9772934
98    exact 0.05  90 0.9755923
99    exact 0.05  95 0.9718140
100   exact 0.05 100 0.9826071
101   exact 0.06   5 0.9980297
102   exact 0.06  10 0.9811622
103   exact 0.06  15 0.9896401
104   exact 0.06  20 0.9943659
105   exact 0.06  25 0.9849507
106   exact 0.06  30 0.9920548
107   exact 0.06  35 0.9831689
108   exact 0.06  40 0.9909419
109   exact 0.06  45 0.9829932
110   exact 0.06  50 0.9906217
111   exact 0.06  55 0.9836566
112   exact 0.06  60 0.9663670
113   exact 0.06  65 0.9668145
114   exact 0.06  70 0.9630279
115   exact 0.06  75 0.9763348
116   exact 0.06  80 0.9716289
117   exact 0.06  85 0.9820840
118   exact 0.06  90 0.9772655
119   exact 0.06  95 0.9687703
120   exact 0.06 100 0.9680765
121   exact 0.07   5 0.9969201
122   exact 0.07  10 0.9964239
123   exact 0.07  15 0.9824673
124   exact 0.07  20 0.9892932
125   exact 0.07  25 0.9934691
126   exact 0.07  30 0.9837683
127   exact 0.07  35 0.9902956
128   exact 0.07  40 0.9801496
129   exact 0.07  45 0.9879752
130   exact 0.07  50 0.9779901
131   exact 0.07  55 0.9679391
132   exact 0.07  60 0.9640110
133   exact 0.07  65 0.9765091
134   exact 0.07  70 0.9702320
135   exact 0.07  75 0.9806132
136   exact 0.07  80 0.9553953
137   exact 0.07  85 0.9692733
138   exact 0.07  90 0.9656231
139   exact 0.07  95 0.9765780
140   exact 0.07 100 0.9715796
141   exact 0.08   5 0.9954747
142   exact 0.08  10 0.9941987
143   exact 0.08  15 0.9950303
144   exact 0.08  20 0.9816556
145   exact 0.08  25 0.9877073
146   exact 

Re: [R] Converting array to matrix

2012-09-28 Thread David Winsemius

On Sep 28, 2012, at 3:59 PM, farnoosh sheikhi wrote:

 Hi,
 
 I have a 3d array as below, I want to make this array to a matrix of 
 p=50(rows) and n=20(columns) with the coverage values .
 The code before the array is:

?matrix

mat <- matrix(datfrm$coverage, 50, 20, byrow = TRUE)  # n varies fastest in the listing
filled.contour(mat)  # untested
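
A self-contained sketch of the reshaping step, for readers without the binom package. The helper cp_coverage is hypothetical: it recomputes the exact (Clopper-Pearson) coverage in base R as a stand-in for binom.coverage, and the long-format data frame is built so that n varies fastest within each p, as in the posted listing.

```r
# Hypothetical helper: coverage of the exact (Clopper-Pearson) interval
cp_coverage <- function(p, n, conf.level = 0.95) {
  alpha <- 1 - conf.level
  x <- 0:n
  lower <- ifelse(x == 0, 0, qbeta(alpha / 2, x, n - x + 1))
  upper <- ifelse(x == n, 1, qbeta(1 - alpha / 2, x + 1, n - x))
  # Sum the probabilities of all outcomes whose interval contains p
  sum(dbinom(x, n, p)[lower <= p & p <= upper])
}

pi.seq <- seq(from = 0.01, to = 0.5, by = 0.01)
no.seq <- seq(from = 5, to = 100, by = 5)

# Long format: the first variable of expand.grid varies fastest, so n cycles within p
grid <- expand.grid(n = no.seq, p = pi.seq)
grid$coverage <- mapply(cp_coverage, grid$p, grid$n)

# byrow = TRUE puts the 20 values for each p into one row: rows are p, columns are n
mat <- matrix(grid$coverage, nrow = length(pi.seq), ncol = length(no.seq),
              byrow = TRUE)

filled.contour(x = pi.seq, y = no.seq, z = mat, xlab = "p", ylab = "n")
```

Passing x and y explicitly labels the axes with the actual p and n values instead of the default 0 to 1 range.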


-- 
David

Re: [R] Errors in if statement

2012-09-28 Thread David Winsemius

On Sep 28, 2012, at 1:16 PM, JiangZhengyu wrote:

 
 Hi guys, I have a geno matrix with many rows (1000) and columns (30). I use 
 the following loop and condition statement (adapted from someone else's code). 
 I always get the error below.  I was wondering if anyone knows what the 
 problem is and how to fix it.  

Boy, it surely looks like missing values are the problem. Have you read:

?sum
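
To spell that out: sum() returns NA as soon as any element is NA, and if() cannot branch on NA. A minimal sketch follows, assuming a small made-up geno matrix and assumed comparison operators (the archived post lost the originals):

```r
# Made-up 4 x 4 genotype matrix with one missing value
geno <- matrix(c(2, 2,  3, 2,
                 2, NA, 1, 1,
                 1, 3,  1, 2,
                 2, 3,  1, 3),
               nrow = 4, byrow = TRUE,
               dimnames = list(NULL, paste0("P", 1:4)))

sum(geno[2, ] != 2)                 # NA: one of the comparisons is NA
sum(geno[2, ] != 2, na.rm = TRUE)   # counts only the non-missing entries

flag <- logical(nrow(geno))
for (i in seq_len(nrow(geno))) {
  g <- geno[i, ]
  # The thresholds and operators below are assumptions for illustration
  flag[i] <- sum(g != 2, na.rm = TRUE) >= 2 &&
             sum(g == 1, na.rm = TRUE) >= 1 &&
             sum(g == 3, na.rm = TRUE) >= 1
}
flag
```

Whether dropping the missing values is appropriate depends on what an NA genotype should mean in the analysis; the alternative is to test for NA first and skip such rows.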

-- 
David.

 Thanks,
 Zhengyu

 ### geno matrix
 P1  P2  P3  P4 
 1  2  2  3 2 
 2  2  2  1 1
 1  2  1  2  NANA 2  3  4  5
 ###
 for(i in 1:4) {
 cat(i,)
 if(sum(geno[i,]!=2)3  sum(geno[i,]==1)=1  sum(geno[i,]==3)=1){
   tmp = 1
   }
 }
 ###
 1 2 Error in if (sum(geno[i, ] != 2)  3  sum(geno[i, ] == 1) = 1  sum(geno[i,  : 
  missing value where TRUE/FALSE needed
 

David Winsemius, MD
Alameda, CA, USA
