Re: [R] Upgrade R?
On Mon, Nov 7, 2011 at 9:23 PM, Kevin Burton rkevinbur...@charter.net wrote:

I am trying to upgrade to R 2.14 from R 2.13.1. I have copied all the libraries from the 'library' directory in my existing installation (2.13.1) to the installed R 2.14. Now I want to uninstall the old installation (R 2.13.1), and I get the error:

Internal Error: Cannot find utCompiledCode record for this version of the uninstaller.

Any ideas?

Kevin

What I would try: reinstall the old version and then uninstall it. I remember that similar approaches worked while I was still using Windows (Windows 2000) ... luckily a long time ago.

Cheers,
Rainer

--
Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation Biology, UCT), Dipl. Phys. (Germany)
Centre of Excellence for Invasion Biology
Stellenbosch University
South Africa
Tel : +33 - (0)9 53 10 27 44
Cell: +33 - (0)6 85 62 59 98
Fax (F): +33 - (0)9 58 10 27 44
Fax (D): +49 - (0)3 21 21 25 22 44
email: rai...@krugs.de
Skype: RMkrug

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] Plot 2 different fields with image.plot()
No, I have not. But it is a very nice option. I can't believe that I overlooked it. Thanks a lot.

Best regards
Re: [R] Sweave and accented letters
That is exactly what I thought I would do. But the problem persists. In the document below I have added \inputencoding utf8, yet LyX fails to compile.

#LyX 2.0 created this file. For more info see http://www.lyx.org/
\lyxformat 413
\begin_document
\begin_header
\textclass article
\use_default_options true
\begin_modules
sweave
\end_modules
\maintain_unincluded_children false
\language english
\language_package default
\inputencoding utf8
\fontencoding global
\font_roman default
\font_sans default
\font_typewriter default
\font_default_family default
\use_non_tex_fonts false
\font_sc false
\font_osf false
\font_sf_scale 100
\font_tt_scale 100
\graphics default
\default_output_format default
\output_sync 0
\bibtex_command default
\index_command default
\paperfontsize default
\spacing single
\use_hyperref false
\papersize default
\use_geometry false
\use_amsmath 1
\use_esint 1
\use_mhchem 1
\use_mathdots 1
\cite_engine basic
\use_bibtopic false
\use_indices false
\paperorientation portrait
\suppress_date false
\use_refstyle 1
\index Index
\shortcut idx
\color #008000
\end_index
\secnumdepth 3
\tocdepth 3
\paragraph_separation indent
\paragraph_indentation default
\quotes_language english
\papercolumns 1
\papersides 1
\paperpagestyle default
\tracking_changes false
\output_changes false
\html_math_output 0
\html_css_as_file 0
\html_be_strict false
\end_header
\begin_body
\begin_layout Standard
è
\end_layout
\end_body
\end_document

On Tue, Nov 15, 2011 at 6:11 AM, Yihui Xie x...@yihui.name wrote:

It might be better to post it to the LyX mailing list (lyx-us...@lists.lyx.org) since you are using LyX. Anyway, the problem comes from Sweave: you did not tell us your R version, and I suppose you are using the latest version of R (2.14.0). There are two ways of telling Sweave your UTF-8 encoding (see ?Sweave); one of them is via \usepackage[utf8]{inputenc}. In LyX, you need to set the document encoding to Unicode (utf8) in Document Settings - Language - Encoding. The next version of LyX (2.0.2) will address this issue better.

Regards,
Yihui

--
Yihui Xie xieyi...@gmail.com
Phone: 515-294-2465 Web: http://yihui.name
Department of Statistics, Iowa State University
2215 Snedecor Hall, Ames, IA

On Mon, Nov 14, 2011 at 7:21 PM, Giuseppe neo...@gmail.com wrote:

I often use LyX/Sweave and I typically write in English. Today I had to write a document in Italian and, as many of you know, many common Italian words use è, ù, é, ò, etc. I discovered that if I type in Italian (that is, there is at least one letter with an accent) with the Sweave module selected, LyX is not able to compile the document correctly. I tried to change the input encoding, but it still does not work. I am attaching a basic LyX file that illustrates the problem. It is identical to the file above except that the header has

\inputencoding utf8-plain

and the body is

\begin_layout Standard
Antoine è bella.
\end_layout
[R] New site for Scientific Computing Q+A
If anyone has ever hesitated about posting a question to R-help that might be less about R and more about scientific computing in general (algorithms, datasets, etc.), then you might be interested in a proposed new site on StackExchange: http://area51.stackexchange.com/proposals/28815/scientific-computing-was-computational-science

It's currently in the proposal stage, and it needs people to register an interest before it goes into the beta phase. Sign up and hit the 'commit' button.

Note this is a different site from the one I talked about at UseR! this year - that's www.stackoverflow.com, which is where R programming questions are asked, much like R-help, along with programming questions of all languages and creeds.

Apologies if this is a teensy bit off-topic for R-help, but I think the site could benefit lots of R-help users - including those who don't want to use the site, by keeping more off-topic questions away from R-help. Win-win.

Barry
[R] averaging between rows with repeated data
*The situation (or an example at least!)*

example <- data.frame(rep(letters[1:10]))
colnames(example)[1] <- "Letters"
example$numb1 <- rnorm(10, 1, 1)
example$numb2 <- rnorm(10, 1, 1)
example$numb3 <- rnorm(10, 1, 1)
example$id <- c("CG234", "CG232", "CG441", "CG128", "CG125",
                "CG182", "CG232", "CG441", "CG232", "CG125")

*this produces something like this:*

   Letters     numb1      numb2        numb3    id
1        a 0.8139130 -0.9775570 -0.002996244 CG234
2        b 0.8268700  0.4980661  1.647717998 CG232
3        c 0.2384088  1.0249684  0.120663273 CG441
4        d 0.8215922  0.5686534  1.591208307 CG128
5        e 0.7865918  0.5411476  0.838300185 CG125
6        f 2.2385522  1.2668070  1.268005020 CG182
7        g 0.7403965 -0.6224205  1.374641549 CG232
8        h 0.2526634  1.0282978 -0.110449844 CG441
9        i 1.9333444  1.6667486  2.937252363 CG232
10       j 1.6996701  0.5964623  1.967870617 CG125

*The Problem:*

Some of these ids are repeated. I want to average the values of those rows within each column, but obviously they have different numbers in the number columns, and they also have different letters in the Letters column. The letters are not necessary for my analysis; only the duplicated ids and the numb columns are important.

I also need to keep the existing dataframe, so I would like to build a new dataframe that averages the repeated values and keeps their id. My actual dataset is much more complex (271*13890), but the solution to this example can be expanded out to my main data set, because there are just more columns of numbers and still only one alphanumeric id to keep.

In my example data, id CG232 occurs 3 times, CG441 and CG125 occur twice, and everything else occurs once. So the new dataframe (from this example) would have 3 number columns (numb1, numb2, numb3) and an id, and the numb column values would be the averages of the rows which had the same id. For example, the new dataframe would contain an entry for CG125 which would be something like this:

 numb1  numb2 numb3    id
1.2431 0.5688 1.403 CG125

Just as a thought: all of the IDs start with CG, so could I use grep (?) to delete CG and replace it with 0, so that duplicated ids could be averaged as numbers (they would be the same)? But I still don't know how to produce the new dataframe with the averaged rows in it...

I hope this is clear enough! Email me if you need further detail or, even better, if you have a solution!

Also, sorry to be posting my second question in under 24 hours, but I seem to have become more than a little stuck - I was making such good progress with R!

Rob

(Also, I'm sorry if this appears more than once on the mailing list - I'm having some network/Windows Live issues, so I'm not convinced previous attempts to send this have worked, but I have no way of telling if they are just milling around on the internet somewhere as we speak and will decide to come out of hiding later!)
[R] Quantstrat; error with applyStrategy()
I'm testing out quantstrat using a simple one-indicator strategy. The error I get after applyStrategy(...) is

Error in .xts(e, .index(e1), .indexCLASS = indexClass(e1), .indexFORMAT = indexFormat(e1), :
  index length must match number of observations
In addition: Warning messages:
1: In match.names(column, colnames(data)) :
  all columns not located in roc_15 for STOXX.Open STOXX.High STOXX.Low STOXX.Close STOXX.Close.1
2: In min(j, na.rm = TRUE) : no non-missing arguments to min; returning Inf
3: In max(j, na.rm = TRUE) : no non-missing arguments to max; returning -Inf

my code (quoting restored; the original had its quote characters and assignment arrows stripped in transit):

# Environments
if( !exists(".instrument") ) .instrument <- new.env()
if( !exists(".blotter") )    .blotter    <- new.env()
if( !exists(".strategy") )   .strategy   <- new.env()
suppressWarnings( rm("account.STOXX", "portfolio.STOXX", pos = .blotter) )
suppressWarnings( rm("stratROC", "initDate", "initEq") )
suppressWarnings( rm("order_book.STOXX", pos = .strategy) )

# Get Data
ch <- odbcConnect("PostgreSQL35W")
tablenames <- sqlTables(ch, schema = "marketdata")[, "TABLE_NAME"]
chooseTable <- function(table.name) x <- table.name
table.name <- guiv(chooseTable, argList = list(table.name = tablenames))
data.df <- sqlFetch(ch, paste("marketdata", table.name, sep = "."),
                    colnames = TRUE, rownames = TRUE)
data.xts <- as.xts(data.df)
STOXX <- xts( cbind( as.numeric(data.xts$open), as.numeric(data.xts$high),
                     as.numeric(data.xts$low), as.numeric(data.xts$close) ),
              order.by = index(data.xts) )
colnames(STOXX) <- c("STOXX.Open", "STOXX.High", "STOXX.Low", "STOXX.Close")
rm(table.name, tablenames, ch, chooseTable)
stock("STOXX", currency = "ZAR", multiplier = 1)
mktdata <- STOXX
spot <- as.xts( as.numeric(data.xts$spot), order.by = index(data.xts) )
colnames(spot) <- "spot"

# Initialise
initDate <- index( first(data.xts) )
initEq <- 100
currency("ZAR")
initPortf('STOXX', symbols = 'STOXX', initDate = initDate)
initAcct('STOXX', portfolios = 'STOXX', initDate = initDate, initEq = 100)
initOrders(portfolio = 'STOXX', initDate = initDate)
stratROC <- strategy("ROC")

# Construct Strategy
stratROC <- add.indicator( strategy = stratROC, name = "ROC",
                           arguments = list( x = Cl(STOXX) ), label = "roc_15" )
# gte
stratROC <- add.signal( stratROC, name = "sigThreshold",
                        arguments = list(column = "roc_15", threshold = 5,
                                         relationship = "gte"),
                        label = "roc.gte" )
# lte
stratROC <- add.signal( stratROC, name = "sigThreshold",
                        arguments = list(column = "roc_15", threshold = -5,
                                         relationship = "lte"),
                        label = "roc.lte" )
# long enter
stratROC <- add.rule( stratROC, name = 'ruleSignal',
                      arguments = list(sigcol = "roc.gte", sigval = TRUE,
                                       orderqty = 1, ordertype = 'market',
                                       orderside = 'long', pricemethod = 'market',
                                       TxnFees = -2),
                      type = 'enter', path.dep = TRUE )
# long exit
stratROC <- add.rule( stratROC, name = 'ruleSignal',
                      arguments = list(sigcol = "roc.lte", sigval = TRUE,
                                       orderqty = 'all', ordertype = 'market',
                                       orderside = 'long', pricemethod = 'market',
                                       TxnFees = -2),
                      type = 'exit', path.dep = TRUE )
# short enter
stratROC <- add.rule( stratROC, name = 'ruleSignal',
                      arguments = list(sigcol = "roc.lte", sigval = TRUE,
                                       orderqty = -1, ordertype = 'market',
                                       orderside = 'short', pricemethod = 'market',
                                       TxnFees = -2),
                      type = 'enter', path.dep = TRUE )
# short exit
stratROC <- add.rule( stratROC, name = 'ruleSignal',
                      arguments = list(sigcol = "roc.gte", sigval = TRUE,
                                       orderqty = 'all', ordertype = 'market',
                                       orderside = 'short', pricemethod = 'market',
                                       TxnFees = -2),
                      type = 'exit',
Re: [R] Sweave and accented letters
On 11-11-15 4:46 AM, Giuseppe wrote:

That is exactly what I thought I would do. But the problem persists. In the document below I have added \inputencoding utf8, yet LyX fails to compile.

Yihui is probably right that this is a question about LyX, not R, but one other thing you can try is to check the encoding of the file you're working with. Are you sure it is UTF-8? Your message encoding is ISO-8859-1, a pretty common encoding on Windows, and it is not UTF-8. I don't know how to tell LyX you are using that, but if you can figure that out, I'd try it.

Duncan Murdoch
Re: [R] What is the CADF test criterion=BIC report?
Hi Paul,

You are right. Model selection takes place using the common data sample across ALL checked models (otherwise you would end up comparing models estimated on different data!). What the procedure returns are the results based on the common sample. If you want the full-sample results, you should re-run the model using the selected lags (fixed).

Best,
Claudio
Re: [R] help to ... import the data from Excel
Jim,

2011/10/15 jim holtman jholt...@gmail.com:

You might also want to consider the XLConnect package. I have had better luck reading/writing Excel files than with xlsReadWrite.

XLConnect looks good, but - as the xlsReadWrite author, and planning to release an xlsx/64-bit successor - I'd be interested to learn what you mean by "better luck" reading/writing.

Thanks a lot and cheers,
Hans-Peter
Re: [R] Problem creating reference manuals from latex
On 11-11-14 10:25 PM, Tyler Rinker wrote:

Duncan, thank you for your reply. I was not clear about the Internet access. I do have access, just at times I don't; hence the need to produce the manuals from LaTeX rather than simply using the Internet. Please pardon my lack of knowledge around your response. You said I'd have to install inconsolata.sty from CTAN. How? Is this installed in an R directory or a TeX directory? Do I use R to install it, or LaTeX, or do I save the file and drop it into a particular folder (directory)? I've used rseek and a simple Google search, which reveal a great deal about inconsolata, but unfortunately I am not grasping what I need to do.

It is a TeX package. You need to use the MiKTeX package installer to install it. I usually set up MiKTeX to do this automatically when it needs a new package, but that requires an available Internet connection; you'll need to do something manually. Start in the Start Menu item for MiKTeX 2.9 and find the package manager item. Run it, and choose to install the inconsolata package.

Duncan Murdoch

Date: Mon, 14 Nov 2011 21:59:10 -0500
From: murdoch.dun...@gmail.com
To: tyler_rin...@hotmail.com
CC: r-help@r-project.org
Subject: Re: [R] Problem creating reference manuals from latex

On 11-11-14 9:44 PM, Tyler Rinker wrote:

R Community,

I often need to view the reference manuals of packages when I do not have Internet access. I have used the code

path <- find.package('tm')
system(paste(shQuote(file.path(R.home("bin"), "R")), "CMD", "Rd2pdf", shQuote(path)))

that someone kindly provided on this help list to generate the manuals from the LaTeX files. This worked well with R 2.13. After the upgrade to R 2.14, I use this code (see below) and get an error message I don't understand. I'm pretty sure

! LaTeX Error: File `inconsolata.sty' not found.

is important, but I don't get its significance. There's a post about it here: http://r.789695.n4.nabble.com/inconsolata-font-for-building-vignettes-with-R-devel-td3838176.html but I am a Windows user, making this a moot point. I know this file is a font file that MiKTeX needs to build the manual. I'd like to be able to generate the reference manuals again without the Internet. While the code above worked in the past, I'm open to alternative methods.

Version: R 2.14.0 2011-10-31
OS: Windows 7
LaTeX: MiKTeX 2.9

Thank you,
Tyler Rinker

path <- find.package('tm')
system(paste(shQuote(file.path(R.home("bin"), "R")), "CMD", "Rd2pdf", shQuote(path)))
Hmm ... looks like a package
Converting parsed Rd's to LaTeX ...
Creating pdf output from LaTeX ...
Warning: running command 'C:\PROGRA~2\MIKTEX~1.9\miktex\bin\texi2dvi.exe --pdf Rd2.tex -I C:/PROGRA~1/R/R-214~1.0/share/texmf/tex/latex -I C:/PROGRA~1/R/R-214~1.0/share/texmf/bibtex/bst' had status 1
Error : running 'texi2dvi' on 'Rd2.tex' failed
LaTeX errors:
! LaTeX Error: File `inconsolata.sty' not found.
Type X to quit or RETURN to proceed, or enter new name. (Default extension: sty)
! Emergency stop.
read *
l.267
! ==> Fatal error occurred, no output PDF file produced!
Error in running tools::texi2dvi
Warning message:
running command 'C:/PROGRA~1/R/R-214~1.0/bin/i386/R CMD Rd2pdf C:/Users/Rinker/R/win-library/2.14/tm' had status 1

You need to install the inconsolata.sty file. It is available on CTAN (the TeX network, not the R one). You say you don't have Internet access, so I don't know how you'll do this, but presumably there's a way: you got MiKTeX installed somehow.

Duncan Murdoch
[R] Question regarding Kruskal-Wallis test
Hello,

I need to analyse an experiment with 3 groups:

Control group
Treated group 1 (drug 1)
Treated group 2 (drug 2)

If I use a Kruskal-Wallis test to analyse the data, and if I define the control group as the reference group, does the test then make the following comparisons?

Control group vs. treated group 1
Control group vs. treated group 2
Treated group 1 vs. treated group 2

I am not sure whether the last comparison is made by the test or not. What happens if I have 4 different groups? Are all possible combinations compared then?

Cheers,
Syrvn
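For reference, base R's kruskal.test() is an omnibus test: it returns a single p-value for the hypothesis that at least one group differs, with no reference group and no pairwise breakdown; pairwise comparisons need a separate post-hoc step. A minimal sketch with made-up data (the group labels, sizes, and shifts below are illustrative assumptions, not from the post):

```r
# Kruskal-Wallis gives ONE p-value for "any group differs",
# however many groups there are; no comparison is singled out.
set.seed(1)
x <- c(rnorm(10, 0), rnorm(10, 0.5), rnorm(10, 2))          # made-up responses
g <- factor(rep(c("control", "drug1", "drug2"), each = 10))  # made-up labels
kw <- kruskal.test(x ~ g)
print(kw$p.value)   # a single number, not a set of pairwise results

# Pairwise comparisons require a post-hoc step, e.g. adjusted
# pairwise Wilcoxon tests over all pairs of groups:
pw <- pairwise.wilcox.test(x, g, p.adjust.method = "holm")
print(pw$p.value)   # matrix of all pairwise p-values
```

With 4 groups the omnibus test still yields one p-value, while pairwise.wilcox.test() would then report all choose(4, 2) = 6 adjusted pairwise comparisons.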
Re: [R] averaging between rows with repeated data
Good morning Rob,

First off, thank you for providing a reproducible example. This is one of those little tasks that R is pretty great at, but there exist \infty ways to do it, and it can be a little overwhelming for the beginner. Here's one with the base function ave():

cbind(ave(example[, 2:4], example[, 5]), id = example[, 5])

This splits example according to the fifth column (id) and averages the other values; we then stick another copy of the id back on the end and are good to go.

The base function aggregate() can do something similar:

aggregate(example[, 2:4], by = example[, 5, drop = F], mean)

Note that you need the little-publicized but super useful drop = F argument to make this one work. There are other ways to do this with the plyr or doBy packages as well, but this should get you started.

Hope it helps,
Michael
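Michael's aggregate() call can be run end-to-end against Rob's example; a self-contained sketch (the set.seed() call and the quoting of the id strings are my additions, and the means change with the random draw):

```r
set.seed(42)                       # my addition, for reproducibility
example <- data.frame(Letters = letters[1:10])
example$numb1 <- rnorm(10, 1, 1)
example$numb2 <- rnorm(10, 1, 1)
example$numb3 <- rnorm(10, 1, 1)
example$id <- c("CG234", "CG232", "CG441", "CG128", "CG125",
                "CG182", "CG232", "CG441", "CG232", "CG125")

# drop = FALSE keeps example[, 5] as a one-column data frame (a list),
# which is what aggregate() expects for `by`; the result has one row
# per unique id, with each numb column averaged over the duplicate rows.
means <- aggregate(example[, 2:4], by = example[, 5, drop = FALSE], mean)
print(means)   # 6 rows: CG125, CG128, CG182, CG232, CG234, CG441
```

The original dataframe is untouched, so this gives the separate averaged dataframe the question asked for.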
Re: [R] averaging between rows with repeated data
Oh sorry -- my mistake with ave() -- I only checked the first row drop = F is an optional argument to the function [ which tells it to return one of what it began with, rather than simplifying. E.g., X = matrix(1:9, 3) is.matrix(X) TRUE is.matrix(X[,2:3]) TRUE is.matrix(X[,3]) FALSE # Just a regular vector is.matrix(X[,3,drop = F]) TRUE Aggregate wants a list in that second slot and data frames are secretly also lists, so keeping it as a data frame gives the desired list. Michael On Tue, Nov 15, 2011 at 7:07 AM, Rob Griffin robgriffin...@hotmail.com wrote: Thanks Michael, That second (aggregate) option worked perfectly - the first (cbind) generated averages for each row between the columns. (rather than between rows for each column). I came so close with aggregate yesterday - it is only slightly different to one my attempts (of admittedly very many attempts) to solve it so feels good that I was going along the right lines at some point! Could you possibly explain what this drop=F term is doing? Rob (A very grateful and relieved phd student). (also if anyone fancies helping me with another problem I posted yesterday: http://r.789695.n4.nabble.com/correlations-between-columns-for-each-row-td4039193.html ) -Original Message- From: R. Michael Weylandt Sent: Tuesday, November 15, 2011 12:46 PM To: robgriffin247 Cc: r-help@r-project.org Subject: Re: [R] averaging between rows with repeated data Good morning Rob, First off, thank you for providing a reproducible example. This is one of those little tasks that R is pretty great at, but there exist \infty ways to do so and it can be a little overwhelming for the beginner: here's one with the base function ave(): cbind(ave(example[,2:4], example[,5]), id = example[,5]) This splits example according to the fifth column (id) and averages the other values: we then stick another copy of the id back on the end and are good to go. 
The base function aggregate() can do something similar:

aggregate(example[, 2:4], by = example[, 5, drop = F], mean)

Note that you need the little-publicized but super-useful drop = F argument to make this one work. There are other ways to do this with the plyr or doBy packages as well, but this should get you started. Hope it helps, Michael On Tue, Nov 15, 2011 at 5:52 AM, robgriffin247 robgriffin...@hotmail.com wrote: *The situation (or an example at least!)*

example <- data.frame(rep(letters[1:10]))
colnames(example)[1] <- "Letters"
example$numb1 <- rnorm(10, 1, 1)
example$numb2 <- rnorm(10, 1, 1)
example$numb3 <- rnorm(10, 1, 1)
example$id <- c("CG234","CG232","CG441","CG128","CG125",
                "CG182","CG232","CG441","CG232","CG125")

*this produces something like this:*

   Letters     numb1      numb2        numb3    id
1        a 0.8139130 -0.9775570 -0.002996244 CG234
2        b 0.8268700  0.4980661  1.647717998 CG232
3        c 0.2384088  1.0249684  0.120663273 CG441
4        d 0.8215922  0.5686534  1.591208307 CG128
5        e 0.7865918  0.5411476  0.838300185 CG125
6        f 2.2385522  1.2668070  1.268005020 CG182
7        g 0.7403965 -0.6224205  1.374641549 CG232
8        h 0.2526634  1.0282978 -0.110449844 CG441
9        i 1.9333444  1.6667486  2.937252363 CG232
10       j 1.6996701  0.5964623  1.967870617 CG125

*The Problem:* Some of these ids are repeated. I want to average the values of those rows within each column; obviously the rows have different numbers in the number columns, and they also have different letters in the Letters column. The letters are not necessary for my analysis; only the duplicated ids and the numb columns are important. I also need to keep the existing data frame, so I would like to build a new data frame that averages the repeated values and keeps their id. My actual dataset is much more complex (271*13890), but the solution to this can be expanded out to my main data set, because there are just more columns of numbers and still only one alphanumeric id to keep. In my example data, id CG232 occurs 3 times and CG441 and CG125 occur twice; everything else occurs once. So in the new data frame (from this example)
there would be 3 number columns (numb1, numb2, numb3) and an id, and the numb column values would be the averages of the rows which had the same id. So, for example, the new data frame would contain an entry for CG125 which would be something like this:

   numb1  numb2 numb3    id
  1.2431 0.5688 1.403 CG125

Just as a thought: all of the ids start with CG, so could I then use grep (?) to delete CG and replace it with 0? That way duplicated ids could be averaged as a number (they would be the same), but I still don’t know how to produce the new data frame with the averaged rows in it... I hope this is clear enough! Email me if you need further detail or, even better, if you have a solution!! Also, sorry to be posting my second question in under 24 hours, but I seem to have become more than a little stuck – I was making such good progress with R!
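For readers of the archive, Michael's aggregate() suggestion can be sketched end-to-end on a reproducible version of Rob's data (the set.seed() call is an addition so the numbers repeat; names follow the example above):

```r
# Reproducible version of Rob's example data
set.seed(42)
example <- data.frame(
  Letters = letters[1:10],
  numb1 = rnorm(10, 1, 1),
  numb2 = rnorm(10, 1, 1),
  numb3 = rnorm(10, 1, 1),
  id = c("CG234", "CG232", "CG441", "CG128", "CG125",
         "CG182", "CG232", "CG441", "CG232", "CG125"),
  stringsAsFactors = FALSE)

# drop = FALSE keeps the one-column selection as a data frame (a list),
# which is what aggregate() expects in its by= slot
avgs <- aggregate(example[, 2:4], by = example[, 5, drop = FALSE], mean)
avgs   # one row per unique id, numb columns averaged within each id
```

The result has one row per unique id (6 here), with the original data frame left untouched.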
Re: [R] Adding units to levelplot's colorkey
On Nov 14, 2011, at 7:20 PM, Carlisle Thacker wrote: Thanks, Dennis. Yes, I can do that, but that locks the physical units to the locations of the labels. I had hoped that there might be something a bit more flexible, like a subtitle or more general text. If you would take the time to describe what you wanted, you would save everybody's time, yours and ours. I'm sure that levelplot's help page refers you to the xyplot help page for details on key parameters. So here are a few of the parameters it makes available, but the entire list takes up several pages and is not reproduced here: title String or expression giving a title for the key. cex.title Zoom factor for the title. lines.title The amount of vertical space to be occupied by the title in lines (in multiples of itself). Defaults to 2. -- David. Carlisle On 11/14/11 6:03 PM, Dennis Murphy wrote: You don't show code or a reproducible example, so I guess you want a general answer. Use the draw.colorkey() function inside the levelplot() call. It takes an argument key =, which accepts a list of arguments, including space, col, at, labels, tick.number, width and height (see p. 155 of the Lattice book). More specifically, the labels argument also accepts a list of values, with components at, labels, cex, col, font, fontface and fontfamily. You could attach the units as labels with a combination of at = and labels = inside the (outer) labels argument, something like

myAt  <- c(1, 3, 5, 7, 9)
myLab <- paste(myAt, "cm")
levelplot( ...
draw.colorkey(key = list(labels = list(at = myAt, labels = myLab)), ...)

If that doesn't work, then please provide a reproducible example. HTH, Dennis On Mon, Nov 14, 2011 at 12:34 PM, Carlisle Thacker carlisle.thac...@noaa.gov wrote: How do I add units (e.g. cm) to the color key of a lattice levelplot? The plots look fantastic, but it would be nice to indicate somewhere near the end of the color key that the values associated with its colors are in centimeters or some other physical units.
The only thing I find is the possibility to specify the labels so that one explicitly includes the units. That leaves little flexibility for positioning where this information appears. Is there a better way? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT
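Dennis's colorkey suggestion can be sketched as a self-contained example; the field z, the tick positions and the unit "cm" are illustrative stand-ins, not values from the original post:

```r
library(lattice)

# Toy field standing in for the poster's data
set.seed(1)
z <- matrix(rnorm(100), 10, 10)
myAt  <- pretty(range(z))          # breaks covering the data range
myLab <- paste(myAt, "cm")         # attach the units to each tick label

# Pass the labelled ticks through levelplot's colorkey= argument
p <- levelplot(z, at = myAt,
               colorkey = list(labels = list(at = myAt, labels = myLab)))
print(p)   # lattice objects must be print()ed inside scripts
```

This labels every tick with its unit rather than adding a separate key title, which (as noted later in the thread) draw.colorkey() does not support.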
Re: [R] Question about linear regression in R
On Nov 14, 2011, at 10:49 PM, Miles Yang wrote: Hi all, I wrote an R program as below:

x <- 1:10
y <- c(3,3,3,3,3,3,3,3,3,3)
fit <- lm(log(y) ~ x)
summary(fit)

And I expected to get some error message from R, because y is constant. But I got the message below: You are asking R to tell you if the mean of the log of 3 is different from 0. It is. -- David.

summary(fit)

Call:
lm(formula = log(y) ~ x)

Residuals:
       Min         1Q     Median         3Q        Max
-6.802e-17 -3.933e-17 -1.063e-17  1.807e-17  1.530e-16

Coefficients:
              Estimate Std. Error    t value Pr(>|t|)
(Intercept)  1.099e+00  4.569e-17  2.404e+16   <2e-16 ***
x           -1.275e-17  7.364e-18 -1.732e+00    0.122
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 6.688e-17 on 8 degrees of freedom
Multiple R-squared: 0.5794, Adjusted R-squared: 0.5269
F-statistic: 11.02 on 1 and 8 DF, p-value: 0.01054

How could this be possible? Did I miss something in my R code? Best, Miles -- Miles Yang Mobile: +61-411-985-538 E-mail: miles2y...@gmail.com [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT
Re: [R] Remove names and more from list with capture.output()
On Nov 14, 2011, at 11:49 PM, Sverre Stausland wrote: Hi R users, I end up with a list object after running an anova:

lm(speed ~ 1 + dist + speed:dist, data = cars) -> Int
lm(speed ~ 1 + dist, data = cars) -> NoInt
anova(Int, NoInt) -> test
test <- test[c("Df", "F", "Pr(>F)")][2,]
is.list(test)
[1] TRUE
test
  Df      F    Pr(>F)
2 -1 18.512 8.481e-05 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

I would like to use capture.output() when writing this information to a text file, but I would like to only print the row, not the names (Df F Pr(>F)), and not the significance codes. That is, I want the text printed to my text file to be: 2 -1 18.512 8.481e-05 *** The output of capture.output() is just going to be a character vector, and you should select the second element:

vec <- capture.output(test)
vec[2]
[1] "2 -1 18.512 8.481e-05 ***"

Is there a way to do this? Thanks Sverre __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT
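David's suggestion can be sketched with a small stand-in data frame (the real `test` object comes from anova(); the values below just mimic its printed row):

```r
# Stand-in for the single-row anova slice in the question
test <- data.frame(Df = -1, F = 18.512, `Pr(>F)` = 8.481e-05,
                   check.names = FALSE)

# capture.output() returns one string per printed line:
# element 1 is the header, element 2 the data row
vec <- capture.output(print(test))
vec[2]                            # just the data row
# writeLines(vec[2], "out.txt")   # then write only that line to a file
```

writeLines() (rather than cat() or write()) keeps the line exactly as printed, with no extra quoting.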
[R] Creating Timeseries by manipulating data table
Hi, I'm new to R and tried a search but couldn't find what I was looking for. I have some data as a csv file with columns: longitude, latitude, year, month, rainfall, region. What I need to do is produce a monthly time series for each region, where region is an integer id and where each time point in the series is the monthly average of rainfall over the locations in that region. Basically I know how to read the data:

raindata <- read.csv("Rainfall.csv")

but I have no idea how to create the time series using R. Any help would be appreciated. Thanks Chuske -- View this message in context: http://r.789695.n4.nabble.com/Creating-Timeseries-by-manipulating-data-table-tp4042875p4042875.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] break error bars in ggplot2
Hello, I use ggplot to plot some measures including CIs as horizontal error bars. I get an error when the scale limits are narrower than the boundaries of the error bar, and hence the CIs are not plotted.

library(ggplot2)
df <- data.frame(resp = c(1, 2), k = c(1, 2), se = c(1, 2))
ggplot(df, aes(resp, y = k)) +
  geom_point() +
  geom_errorbarh(aes(xmax = resp + se, xmin = resp - se)) +
  scale_x_continuous(limits = c(-1, 3))

Is there a way to plot the error bars anyway? Setting xmax to the scale limit is not so good, I guess, because you couldn't determine whether the CI is wider than the scale limits or not. Thanks a lot, Best, Felix Dr. rer. nat. Dipl.-Psych. Felix Fischer Institut für Sozialmedizin, Epidemiologie und Gesundheitsökonomie Charité - Universitätsmedizin Berlin Luisenstrasse 57 10117 Berlin Tel: 030 450 529 104 Fax: 030 450 529 902 http://epidemiologie.charite.de [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] averaging between rows with repeated data
Thanks Michael, That second (aggregate) option worked perfectly - the first (cbind) generated averages for each row between the columns (rather than between rows for each column). I came so close with aggregate yesterday - it is only slightly different to one of my attempts (of admittedly very many) to solve it, so it feels good that I was going along the right lines at some point! Could you possibly explain what this drop=F term is doing? Rob (a very grateful and relieved PhD student). (Also, if anyone fancies helping me with another problem I posted yesterday: http://r.789695.n4.nabble.com/correlations-between-columns-for-each-row-td4039193.html ) -Original Message- From: R. Michael Weylandt Sent: Tuesday, November 15, 2011 12:46 PM To: robgriffin247 Cc: r-help@r-project.org Subject: Re: [R] averaging between rows with repeated data Good morning Rob, First off, thank you for providing a reproducible example. This is one of those little tasks that R is pretty great at, but there exist \infty ways to do so, and it can be a little overwhelming for the beginner. Here's one with the base function ave():

cbind(ave(example[, 2:4], example[, 5]), id = example[, 5])

This splits example according to the fifth column (id) and averages the other values; we then stick another copy of the id back on the end and are good to go. The base function aggregate() can do something similar:

aggregate(example[, 2:4], by = example[, 5, drop = F], mean)

Note that you need the little-publicized but super-useful drop = F argument to make this one work. There are other ways to do this with the plyr or doBy packages as well, but this should get you started.
[R] How to plot hierarchical clustering with different colors?
Dear experts, I would like to plot a hierarchical clustering of 300 items. I have a distance matrix with dimension 300*300. The 300 items come from 7 groups, which I would like to label with 7 different colours in the plot.

h <- hclust(as.dist(distance_matrix_300))
plot(h, hang = -1, cex = 0.5, col = "blue")

I used the above script to plot the result. The cluster was all blue; the 300 item names were all displayed in blue. When I tried to specify the seven different colours, I found there was no option to get the names displayed in different colours. When I used the graph and Rgraphviz packages, I could define an attribute list to specify the different group colours and plot a colourful graph. Is there an option in hierarchical clustering to do the same? Could someone kindly help me with this problem? Thanks in advance. Sincerely Yan -- View this message in context: http://r.789695.n4.nabble.com/How-to-plot-hierarchical-clustering-with-different-colors-tp4042734p4042734.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
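One common approach, not mentioned in the thread itself, is to convert the hclust result to a dendrogram and set each leaf's lab.col attribute with dendrapply(). The toy data and the group vector below are stand-ins, since only a distance matrix and an unstated group assignment are available in the original question:

```r
# Toy stand-in: 30 items instead of 300, with assumed group membership
set.seed(1)
x <- matrix(rnorm(60), nrow = 30)
grp <- sample(1:7, 30, replace = TRUE)      # assumed: group of each item
names(grp) <- paste0("item", 1:30)
rownames(x) <- names(grp)

h <- hclust(dist(x))
d <- as.dendrogram(h)

# Colour each leaf label according to its group
colLab <- function(n) {
  if (is.leaf(n)) {
    attr(n, "nodePar") <- c(attr(n, "nodePar"),
                            list(lab.col = grp[attr(n, "label")]))
  }
  n
}
d <- dendrapply(d, colLab)
plot(d)
```

With a real 300*300 distance matrix, the same colLab() works unchanged as long as grp is a vector of colours (or palette indices) named by the item labels.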
Re: [R] Creating Timeseries by manipulating data table
It's a big subject and various mechanisms exist, but you should probably start by looking into the zoo package and the read.zoo() function. Hope that helps, Michael On Tue, Nov 15, 2011 at 8:03 AM, Chuske jrm...@ex.ac.uk wrote: Hi, I'm new to R and tried a search but couldn't find what I was looking for. I have some data as a csv file with columns: longitude, latitude, year, month, rainfall, region. What I need to do is produce a monthly time series for each region, where region is an integer id and where each time point in the series is the monthly average of rainfall over the locations in that region. Basically I know how to read the data:

raindata <- read.csv("Rainfall.csv")

but I have no idea how to create the time series using R. Any help would be appreciated. Thanks Chuske -- View this message in context: http://r.789695.n4.nabble.com/Creating-Timeseries-by-manipulating-data-table-tp4042875p4042875.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] gsDesign
Hi Dongli, Questions about usage of specific contributed packages are best directed toward the package maintainer/author first, as they are likely the best sources of information, and they don't necessarily subscribe to or keep up with the daily deluge of R-help messages. (In this particular case, I'm quite sure the package maintainer for gsDesign doesn't keep up with R-help.) Best, Andy -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Dongli Zhou Sent: Monday, November 14, 2011 6:13 PM To: Marc Schwartz Cc: r-help@r-project.org Subject: Re: [R] gsDesign Hi, Marc, Thank you very much for the reply. I'm using the gsDesign function to create an object of type gsDesign, but its inputs do not include the 'ratio' argument. Dongli On Nov 14, 2011, at 5:50 PM, Marc Schwartz marc_schwa...@me.com wrote: On Nov 14, 2011, at 4:11 PM, Dongli Zhou wrote: I'm trying to use gsDesign for a noninferiority trial with a binary endpoint. Does anyone know how to specify the trial with different sample sizes for the two treatment groups? Thanks in advance! Hi, Presuming that you are using the nBinomial() function, see the 'ratio' argument, which defines the desired sample size ratio between the two groups. See ?nBinomial and the examples there, which include one using the 'ratio' argument. HTH, Marc Schwartz __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] RODBC: connecting to MySQL from R
Hi, has anyone used the RODBC package to access a MySQL database from R on Windows? What else do I need to do besides: 1) Add the MySQL driver to the User DSN list in Control Panel - Administrative Tools - Data Sources (ODBC); 2) Test that the connection works; 3) Run: ch <- odbcConnect("mydsn", uid = "myui", pwd = "mypass"). Any ideas? Thanks -- Patricia García González [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
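Assuming the DSN has been configured as in step 1, a typical RODBC session looks roughly like the sketch below; the DSN name, credentials and table name are placeholders carried over from the question, and a working MySQL ODBC driver must already be installed:

```r
library(RODBC)

# "mydsn", the user id, password and table are placeholders -- a
# configured DSN is required before this will connect
ch <- odbcConnect("mydsn", uid = "myui", pwd = "mypass")

sqlTables(ch)                                    # list tables in the database
res <- sqlQuery(ch, "SELECT * FROM some_table")  # run an SQL query, get a data frame

close(ch)                                        # always close the channel
```

If odbcConnect() returns -1, the DSN name or credentials are wrong; the warning messages it prints usually identify which.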
Re: [R] Creating Timeseries by manipulating data table
On Tue, Nov 15, 2011 at 8:20 AM, R. Michael Weylandt michael.weyla...@gmail.com wrote: It's a big subject and various mechanisms exist, but you should probably start by looking into the zoo package and the read.zoo() function. Note that there is an entire vignette on read.zoo as well. See the Reading Data in Zoo link here: http://cran.r-project.org/web/packages/zoo/index.html or from within R:

vignette("zoo-read")

If you haven't used zoo before, you should read all 5 vignettes and the help files. -- Statistics Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
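Alongside the zoo pointers, Chuske's task can also be sketched in base R: aggregate to region/month means, then build one ts object per region. The data below are made up in the shape he describes, and the ts step assumes each region has no missing months (real data would need checking, or zoo, which handles gaps):

```r
set.seed(7)
# Made-up data in the described shape: one row per location and month
raindata <- data.frame(
  longitude = runif(100, -5, 5),
  latitude  = runif(100, 50, 55),
  year      = sample(2000:2001, 100, replace = TRUE),
  month     = sample(1:12, 100, replace = TRUE),
  rainfall  = rexp(100, 1/50),
  region    = sample(1:3, 100, replace = TRUE))

# Monthly mean rainfall over all locations within each region
m <- aggregate(rainfall ~ region + year + month, data = raindata, mean)
m <- m[order(m$region, m$year, m$month), ]

# One monthly ts object per region
series <- lapply(split(m, m$region), function(d)
  ts(d$rainfall, start = c(d$year[1], d$month[1]), frequency = 12))
```

The list `series` then holds one time series per region id, ready for plot() or further analysis.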
Re: [R] break error bars in ggplot2
Fischer, Felix Felix.Fischer at charite.de writes: Hello, i use ggplot to plot some measures including CIs as horizontal errorbars. I get an error when the scale limits are narrower than the boundaries of the error bar and hence the CIs are not plotted. library(ggplot2) df - data.frame(resp=c(1,2), k=c(1,2), se=c(1,2)) ggplot(df, aes(resp,y=k)) + geom_point() + geom_errorbarh(aes(xmax = resp + se, xmin = resp - se)) + scale_x_continuous(limits=c(-1,3)) Is there a way to plot the errorbars anyway? Setting xmax to the scale limit is not so good, I guess, because you couldn't determine whether the CI is wider than the scale limits or not. I'm not sure I completely understand your last paragraph, but I think you want to substitute coord_cartesian(xlim=c(-1,3)) for your scale_x_continuous() component; as discussed in the ggplot2 book, limits set on scales act differently than limits set on coordinate systems. (I'm a little surprised you get an error, though.) There's a very active ggplot2 google group that might be best for ggplot(2)-specific questions ... __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
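The substitution suggested above can be sketched directly on Felix's example; coord_cartesian() zooms the view without dropping the out-of-range data, so the error bars survive:

```r
library(ggplot2)

df <- data.frame(resp = c(1, 2), k = c(1, 2), se = c(1, 2))

# coord_cartesian() clips the view instead of discarding data, unlike
# scale_x_continuous(limits = ...), which removes out-of-range points
p <- ggplot(df, aes(resp, y = k)) +
  geom_point() +
  geom_errorbarh(aes(xmax = resp + se, xmin = resp - se)) +
  coord_cartesian(xlim = c(-1, 3))
```

Printing p draws the error bars truncated at the panel edge, which also answers Felix's concern: a bar running into the edge signals a CI wider than the visible range.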
Re: [R] Question about linear regression in R
Dear Miles, Within rounding error, you got the right intercept, log(3); slope, 0; residuals, all 0; residual standard error, 0; and standard errors of the intercept and slope, both 0. The R^2 should have been undefined (i.e., 0/0), but dividing one number that's 0 within rounding error by another gives a wild result. Likewise for the omnibus F and the t-test for the slope. The moral: don't expect floating-point arithmetic on a computer to be perfectly precise. I hope this helps, John John Fox Senator William McMaster Professor of Social Statistics Department of Sociology McMaster University Hamilton, Ontario, Canada http://socserv.mcmaster.ca/jfox -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Miles Yang Sent: November-14-11 10:49 PM To: r-help-requ...@r-project.org; r-help@r-project.org Subject: [R] Question about linear regression in R Hi all, I wrote an R program as below:

x <- 1:10
y <- c(3,3,3,3,3,3,3,3,3,3)
fit <- lm(log(y) ~ x)
summary(fit)

And I expected to get some error message from R, because y is constant. But I got the message below:

summary(fit)

Call:
lm(formula = log(y) ~ x)

Residuals:
       Min         1Q     Median         3Q        Max
-6.802e-17 -3.933e-17 -1.063e-17  1.807e-17  1.530e-16

Coefficients:
              Estimate Std. Error    t value Pr(>|t|)
(Intercept)  1.099e+00  4.569e-17  2.404e+16   <2e-16 ***
x           -1.275e-17  7.364e-18 -1.732e+00    0.122
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 6.688e-17 on 8 degrees of freedom
Multiple R-squared: 0.5794, Adjusted R-squared: 0.5269
F-statistic: 11.02 on 1 and 8 DF, p-value: 0.01054

How could this be possible? Did I miss something in my R code?
Best, Miles -- Miles Yang Mobile: +61-411-985-538 E-mail: miles2y...@gmail.com [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
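John's point can be checked numerically: the fitted intercept equals log(3) and the residuals are zero up to machine precision, so everything downstream (R^2, F, t) is noise divided by noise:

```r
x <- 1:10
y <- rep(3, 10)
fit <- lm(log(y) ~ x)

all.equal(unname(coef(fit)[1]), log(3))  # intercept is log(3) up to rounding
max(abs(residuals(fit)))                 # tiny: pure floating-point noise
.Machine$double.eps                      # ~2.2e-16, the relevant precision
```

Comparing max(abs(residuals(fit))) with .Machine$double.eps shows the residuals live at the rounding scale, which is why the R^2 of 0.5794 is meaningless here.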
Re: [R] averaging between rows with repeated data
On Nov 15, 2011, at 6:46 AM, R. Michael Weylandt wrote: Good morning Rob, First off, thank you for providing a reproducible example. This is one of those little tasks that R is pretty great at, but there exist \infty ways to do so, and it can be a little overwhelming for the beginner. Here's one with the base function ave():

cbind(ave(example[, 2:4], example[, 5]), id = example[, 5])

This splits example according to the fifth column (id) and averages the other values; we then stick another copy of the id back on the end and are good to go. The base function aggregate() can do something similar:

aggregate(example[, 2:4], by = example[, 5, drop = F], mean)

Note that you need the little-publicized but super-useful drop = F argument to make this one work. The way I usually deal with that is to wrap list() around the by= argument, since I usually forget about this aggregate quirk and get an error message complaining: 'by' must be a list. (drop=FALSE has the effect of keeping data.frame columns as lists too, so I am not disagreeing here.)

aggregate(example[, 2:4], by = list(example[, 5]), mean)

-- David. There are other ways to do this with the plyr or doBy packages as well, but this should get you started.
Re: [R] Adding units to levelplot's colorkey
On Nov 15, 2011, at 8:23 AM, Carlisle Thacker wrote: Sorry that I was not clear. I was asking how to add annotation to levelplot's colorkey, not the levelplot itself. The only entry I can find in the help pages is via its labels. Googling did yield this: draw.colorkey() doesn't support a title for the legend. So I presume there is also no support for any other annotation. You are right. I failed to test my presumptions. Apologies extended. -- David. Thanks for your assistance. On 11/15/11 7:54 AM, David Winsemius wrote: On Nov 14, 2011, at 7:20 PM, Carlisle Thacker wrote: Thanks, Dennis. Yes, I can do that, but that locks the physical units to the locations of the labels. I had hoped that there might be something a bit more flexible, like a subtitle or more general text. If you would take the time to describe what you wanted, you would save everybody's time, yours and ours. I'm sure that levelplot's help page refers you to the xyplot help page for details on key parameters. So here are a few of the parameters it makes available, but the entire list takes up several pages and is not reproduced here: title String or expression giving a title for the key. cex.title Zoom factor for the title. lines.title The amount of vertical space to be occupied by the title in lines (in multiples of itself). Defaults to 2. David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Bootstrap values for hierarchical tree based on distaance matrix
I would like to get a hierarchical clustering tree with bootstrap values indicated on the nodes, as in pvclust. The problem is that I have only a distance matrix instead of the raw data required for pvclust. Is there a way to get it? fit1 <- hclust(dist) # 'dist' is an object of class 'dist' plot(fit1) # dendrogram without p values library(pvclust) fit2 <- pvclust(raw.data, method.hclust="ward", method.dist="euclidean") plot(fit2) # dendrogram with p values -- View this message in context: http://r.789695.n4.nabble.com/Bootstrap-values-for-hierarchical-tree-based-on-distaance-matrix-tp4042939p4042939.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
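No answer appears in this digest, but one commonly suggested workaround (an assumption on my part, not from the thread) is to reconstruct approximate coordinates from the distance matrix with classical multidimensional scaling, then feed those to pvclust. pvclust clusters the columns of its input, so the coordinate matrix is transposed:

```r
# Sketch: rebuild an approximate configuration from a 'dist' object with
# cmdscale(), then bootstrap the clustering of the objects with pvclust.
# eurodist is a built-in 'dist' object used here as a stand-in.
d <- eurodist
coords <- cmdscale(d, k = 2)   # increase k for a closer approximation

# pvclust clusters the *columns* of its input, so transpose
# (pvclust must be installed; uncomment to run):
# library(pvclust)
# fit <- pvclust(t(coords), method.hclust = "ward.D2",
#                method.dist = "euclidean")
# plot(fit)  # dendrogram with approximate p values on the nodes
```

The bootstrap values are then only as good as the MDS reconstruction, so this is an approximation, not a substitute for the raw data.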
Re: [R] Adding units to levelplot's colorkey
Sorry that I was not clear. I was asking how to add annotation to levelplot's colorkey, not the levelplot itself. The only entry I can find from the help pages is via its labels. Googling did yield this: draw.colorkey() doesn't support a title for the legend. So I presume there is also no support for any other annotation. Thanks for your assistance. On 11/15/11 7:54 AM, David Winsemius wrote: On Nov 14, 2011, at 7:20 PM, Carlisle Thacker wrote: Thanks, Dennis. Yes, I can do that, but that locks the physical units to locations of the labels. I had hoped that there might be something a bit more flexible, like a subtitle or more general text. If you would take the time to describe what you wanted you would save everybody's time, yours and ours. I'm sure that levelplot's help page refers you to the xyplot help page for details on key parameters. So here are a few of the parameter it makes available but the entirelist takes up several pages and is not reproduced here: title String or expression giving a title for the key. cex.title Zoom factor for the title. lines.title The amount of vertical space to be occupied by the title in lines (in multiples of itself). Defaults to 2. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] RODBC conectar MySQL con R
On 15.11.2011 14:34, Usuario R wrote [translated from Spanish]: Hello, has anyone used the RODBC package to access a MySQL database from R on Windows? What else do I have to do apart from: 1) add the MySQL driver to the list of User DSNs in Control Panel - Administrative Tools - Data Sources (ODBC); 2) test that the connection works; 3) run: ch <- odbcConnect("mydsn", uid="myui", pwd="mypass") 1. This list is in English. (I don't answer in German either, after all.) 2. Please read the list's posting guide (cited below). 3. What is your problem? We really need reproducible code, what you tried, and the error message you got. Your description above is a recipe that seems to be OK. Best, Uwe Ligges. Any idea? Thanks. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] gsDesign
Hi Dongli, Sorry for the delay in following up. You might want to read the dsDesignManual.pdf document, which is available in the 'inst/doc' folder in the package source tarball on CRAN, or in the package 'doc' directory in your R installation. Use: system.file(package = gsDesign) to get the package top directory for your installation. The above file will be in the 'doc' sub-directory from there. It has more extensive worked examples than the default package manual. Simple non-inferiority example from ?nBinomial, with 2:1 ratio: n.Fix - nBinomial(p1 = .677, p2 = .677, delta0 = 0.07, ratio = 2) n.Fix [1] 2056.671 # Adjust that *up* to an integer multiple of 3 n.Fix - 2058 # Change 'outtype' to 2 if you want to see per arm sample sizes # eg: nBinomial(p1 = .677, p2 = .677, delta0 = 0.07, ratio = 2, outtype = 2) $n1 [1] 685.5569 $n2 [1] 1371.114 # Simple default GS design using the fixed study design sample size from above, # which is not yet adjusted for interim analyses gsDesign(n.fix = n.Fix) Asymmetric two-sided group sequential design with 90 % power and 2.5 % Type I Error. Upper bound spending computations assume trial continues if lower bound is crossed. Lower bounds Upper bounds- Analysis NZ Nominal p Spend+ Z Nominal p Spend++ 1 734 -0.240.4057 0.0148 3.010.0013 0.0013 2 1468 0.940.8267 0.0289 2.550.0054 0.0049 3 2202 2.000.9772 0.0563 2.000.0228 0.0188 Total 0.1000 0.0250 + lower bound beta spending (under H1): Hwang-Shih-DeCani spending function with gamma = -2 ++ alpha spending: Hwang-Shih-DeCani spending function with gamma = -4 Boundary crossing probabilities and expected sample size assume any cross stops the trial Upper boundary (power or Type I Error) Analysis Theta 1 2 3 Total E{N} 0. 0.0013 0.0049 0.0171 0.0233 1286.0 0.0715 0.1412 0.4403 0.3185 0.9000 1628.4 Lower boundary (futility or Type II Error) Analysis Theta 1 2 3 Total 0. 
0.4057 0.4290 0.1420 0.9767 0.0715 0.0148 0.0289 0.0563 0.1000 So rather than needing 2058 from the fixed design, you actually need 2202 (1468 in one arm and 734 in the other). I would urge you to read the manual I reference above and as Andy has noted in his reply, contact Keaven directly for further assistance with this package. HTH, Marc On Nov 14, 2011, at 5:13 PM, Dongli Zhou wrote: Hi, Marc, Thank you very much for the reply. I'm using the gsDesign function to create an object of type gsDesign. But the inputs do not include the 'ratio' argument. Dongli On Nov 14, 2011, at 5:50 PM, Marc Schwartz marc_schwa...@me.com wrote: On Nov 14, 2011, at 4:11 PM, Dongli Zhou wrote: I'm trying to use gsDesign for a noninferiority trial with binary endpoint. Did anyone know how to specify the trial with different sample sizes for two treatment groups? Thanks in advance! Hi, Presuming that you are using the nBinomial() function, see the 'ratio' argument, which defines the desired sample size ratio between the two groups. See ?nBinomial and the examples there, which does include one using the 'ratio' argument. HTH, Marc Schwartz __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] grid.arrange, grid.layout - legend, global y axis title
Hello, I created several plots with ggplot2 in dev mode. Now I want to combine the plots in a grid, e.g. 2x2, with a fixed size of the output. What I am doing at the moment is:

grid.newpage()
pushViewport(viewport(layout = grid.layout(nrow=2, ncol=2, widths = unit(c(7.5, 6.5), "cm"), heights = unit(rep(5, 2), "cm"))))
print(plot1, vp = viewport(layout.pos.row = 1, layout.pos.col = 1))
print(plot2, vp = viewport(layout.pos.row = 1, layout.pos.col = 2))
print(plot3, vp = viewport(layout.pos.row = 2, layout.pos.col = 1))
print(plot4, vp = viewport(layout.pos.row = 2, layout.pos.col = 2))

This is working well so far. The y-axes are the same for all plots, so I'd like to have a global y-axis title on the left side. How can that be done using my approach? I would also like to add a global vertical legend for my plots below all plots. The legend should show the two different symbols (the same as for the single plots (ggplot2)). I also don't know how to do that. I know that there is the function grid.arrange, which can do both things, but this isn't working because I am in the dev mode of ggplot2. Then I get the error: Error: could not find function "ggplotGrob". I load my libraries the following way: 1) import data 2) load library(gridExtra) 3) load library(devtools); dev_mode(TRUE); library(ggplot2); library(reshape2) 4) produce plots 5) arrange the plots. So what is the best way to proceed? Should I stay with the grid.layout approach, and can I get a global legend and a global y-axis title there? Or how can I use grid.arrange, define the position of the global legend, and set the single plots to a fixed size? I hope that wasn't too complicated... /Johannes -- __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
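Within the grid.layout approach, a global y-axis title can be drawn with plain grid, independent of the ggplot2 dev-mode issue. A minimal sketch using only base grid (the label text and the extra margin column are my assumptions):

```r
library(grid)

grid.newpage()
# Reserve a narrow left column for the shared axis title,
# plus the original 2x2 block of fixed-size plot cells
pushViewport(viewport(layout = grid.layout(
  nrow = 2, ncol = 3,
  widths  = unit(c(1, 7.5, 6.5), c("lines", "cm", "cm")),
  heights = unit(rep(5, 2), "cm"))))

# Global y-axis title, rotated 90 degrees, spanning both plot rows
pushViewport(viewport(layout.pos.row = 1:2, layout.pos.col = 1))
grid.text("Shared y-axis label", rot = 90)
popViewport()

# The ggplot objects then go into columns 2 and 3, e.g.:
# print(plot1, vp = viewport(layout.pos.row = 1, layout.pos.col = 2))
# print(plot2, vp = viewport(layout.pos.row = 1, layout.pos.col = 3))
# ... and so on
```

A global legend can be handled the same way: add a third row to the layout and draw (or print) the legend into a viewport spanning its columns.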
Re: [R] Reading a specific column of a csv file in a loop
In the solution below, what is the advantage of using 0L? M0 <- read.csv("M1.csv", nrows = 1)[0L, ] Thanks! 2011/11/8 Gabor Grothendieck ggrothendi...@gmail.com: 2011/11/8 Sergio René Araujo Enciso araujo.enc...@gmail.com: Dear all: I have two large files with 2000 columns. For each file I am performing a loop to extract the ith element of each file and create a data frame with both ith elements in order to perform further analysis. I am not extracting all the ith elements but only certain ones, which I am indicating in a vector called d. See an example of my code below:

### generate an example for the CSV files; the original files contain more
### than 2000 columns, here for the sake of simplicity they have only 10
M1 <- matrix(rnorm(1000), nrow=100, ncol=10, dimnames=list(seq(1:100), letters[1:10]))
M2 <- matrix(rnorm(1000), nrow=100, ncol=10, dimnames=list(seq(1:100), letters[1:10]))
write.table(M1, file="M1.csv", sep=",")
write.table(M2, file="M2.csv", sep=",")
### the vector containing the i elements to be read
d <- c(1,4,7,8)
P1 <- read.table("M1.csv", header=TRUE)
P2 <- read.table("M2.csv", header=TRUE)
for (i in d) {
  M <- data.frame(P1[i], P2[i])
  rm(list=setdiff(ls(), d))
}

As the files are quite large, I want to include read.table within the loop so that it only reads the ith element. I know that there is the option colClasses, for which I have to create a vector with zeros for all the columns I do not want to load. Nonetheless, I have no idea how to make this vector change in the loop so that the only element with no zeros is the ith element following the vector d. Any ideas how to do this? Or is there any other approach to load only a specific element?
It's a bit messy if there are row names, so let's generate M1.csv like this: write.csv(M1, file = "M1.csv", row.names = FALSE) Then we can do this:

nc <- ncol(read.csv("M1.csv", nrows = 1))
colClasses <- replace(rep("NULL", nc), d, NA)
M1.subset <- read.csv("M1.csv", colClasses = colClasses)

or, using the same M1.csv that we just generated, try this, which uses sqldf with the H2 backend:

library(sqldf)
library(RH2)
M0 <- read.csv("M1.csv", nrows = 1)[0L, ]
M1.subset.h2 <- sqldf(c("insert into M0 (select * from csvread('M1.csv'))",
                        "select a, d, g, h from M0"))

This is referred to as Alternative 3 in FAQ #10, Example 6a on the sqldf home page: http://sqldf.googlecode.com . Alternative 1 and Alternative 2 listed there could also be tried. (Note that although sqldf has a read.csv.sql command, we did not use it here since that command only works with the SQLite back end, and the RSQLite driver has a max of 999 columns.) -- Statistics Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
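The colClasses trick above is self-contained base R; here is a runnable version (file name and the column index vector d taken from the thread):

```r
# Build a small 10-column example file, as in the original post
set.seed(1)
M1 <- matrix(rnorm(1000), nrow = 100, ncol = 10,
             dimnames = list(NULL, letters[1:10]))
write.csv(M1, file = "M1.csv", row.names = FALSE)

d  <- c(1, 4, 7, 8)                       # columns we actually want
nc <- ncol(read.csv("M1.csv", nrows = 1))
# "NULL" tells read.csv to skip a column entirely; NA lets it guess the class
colClasses <- replace(rep("NULL", nc), d, NA)
M1.subset  <- read.csv("M1.csv", colClasses = colClasses)

names(M1.subset)   # "a" "d" "g" "h"
```

Only the four requested columns are ever parsed, which is the point of the trick for very wide files.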
Re: [R] [stats-rosuda-devel] Error .jcall(mxe, S, fit, c(autorun, -e, afn, -o, dirout, : java.lang.NoSuchMethodError: density.Params.readFromArgs([Ljava/lang/String; )Ljava/lang/String;
Please ask the authors and/or discussion list of dismo; it may be a bug in dismo or an incompatibility between the maxent you are using and the package. I can't check since maxent has a restrictive license. Also please do not cross-post to multiple mailing lists. Thanks, Simon On Nov 14, 2011, at 1:46 AM, ahwangyuwei wrote: Dear all, I get the error when I use maxent.jar: Error .jcall(mxe, "S", "fit", c("autorun", "-e", afn, "-o", dirout, : java.lang.NoSuchMethodError: density.Params.readFromArgs([Ljava/lang/String;)Ljava/lang/String; sessionInfo() result: R version 2.14.0 (2011-10-31) Platform: i386-pc-mingw32/i386 (32-bit) locale: [1] LC_COLLATE=Chinese_People's Republic of China.936 [2] LC_CTYPE=Chinese_People's Republic of China.936 [3] LC_MONETARY=Chinese_People's Republic of China.936 [4] LC_NUMERIC=C [5] LC_TIME=Chinese_People's Republic of China.936 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] maptools_0.8-10 lattice_0.20-0 foreign_0.8-46 rJava_0.9-2 [5] dismo_0.7-11 raster_1.9-41 sp_0.9-91 loaded via a namespace (and not attached): [1] grid_2.14.0 The details: I am using the dismo package. dismo has a function 'maxent' that communicates with this program (MaxEnt). MaxEnt is available as a stand-alone Java program. It is normal when I execute the command: jar <- paste(system.file(package="dismo"), "/java/maxent.jar", sep='') When I execute the function xm <- maxent(predictors, pres_train, factors='biome'), R shows the error. Java is correctly installed, version 1.6.0_18. My R version is 2.14.0. I don't know how to solve the problem. Will you help me out? Thank you, all.
Yuwei Wang Cnic,CAS ___ stats-rosuda-devel mailing list stats-rosuda-de...@listserv.uni-augsburg.de http://mailman.rz.uni-augsburg.de/mailman/listinfo/stats-rosuda-devel __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Reading a specific column of a csv file in a loop
On Tue, Nov 15, 2011 at 9:44 AM, Juliet Hannah juliet.han...@gmail.com wrote: In the solution below, what is the advantage of using 0L? M0 <- read.csv("M1.csv", nrows = 1)[0L, ] As mentioned, you will find quite a bit of additional info on the sqldf home page, but to address the specific question regarding the use of 0L in this code:

library(sqldf)
library(RH2)
M0 <- read.csv("M1.csv", nrows = 1)[0L, ]
M1.subset.h2 <- sqldf(c("insert into M0 (select * from csvread('M1.csv'))",
                        "select a, d, g, h from M0"))

in order to use H2's csvread function we must first create the table into which csvread reads, as csvread does not itself create tables; it only fills in existing tables. In SQL, table creation is done with a create statement: create table M0(a real, b real, ...etc.) This creates a table with zero rows; however, with 2000 columns that would be an enormous create statement, as every one of the 2000 columns would have to be listed; therefore, we just upload a zero-row table, M0, from R instead. -- Statistics Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
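The [0L, ] idiom itself is plain base R: indexing with the integer literal 0L keeps all the columns (names and classes) while dropping every row, which is exactly the shape an empty SQL table needs:

```r
# A one-row data frame, then the same frame with zero rows
row1 <- data.frame(a = 1.5, b = "x", stringsAsFactors = FALSE)
M0   <- row1[0L, ]

nrow(M0)           # 0
names(M0)          # "a" "b"
sapply(M0, class)  # column classes survive: "numeric" "character"
```

Using 0L rather than 0 avoids a double-to-integer coercion on each subset; for a single call the difference is negligible, it is mostly a style convention.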
Re: [R] help to ... import the data from Excel
Part of my problem has to do with getting through the corporate firewall to access the other program I have to download to use it. I just tried today and this is what I got: xls.getshlib() Loading required package: tools --- xls.getshlib running... --- - download.file from 'http://dl.dropbox.com/u/2602516/swissrpkg/bin/win32/shlib/xlsReadWrite_1.5.4_dll.zip' (timeout: 60) Error in copyOrDownload(url) : downloading 'http://dl.dropbox.com/u/2602516/swissrpkg/bin/win32/shlib/xlsReadWrite_1.5.4_dll.zip' to 'C:\DOCUME~1\kon9407\LOCALS~1\Temp\RtmpQR6rWi/xlsReadWrite.zip' failed In addition: Warning message: In download.file(url, fpzip.temp, method = internal, quiet = TRUE, : cannot open: HTTP status was '403 Forbidden' Enter a frame number, or 0 to exit 1: xls.getshlib() 2: copyOrDownload(url) I think when I was using it in the past, I had some problem with writing to multiple output sheets so I could create tabs on the workbook, but not really sure what was happening at the time. I have had better luck with XLConnect in creating tabbed workbooks. On Tue, Nov 15, 2011 at 6:06 AM, Hans-Peter Suter gcha...@gmail.com wrote: Jim, 2011/10/15 jim holtman jholt...@gmail.com: You might also want to consider the XLConnect package. I have had better luck reading/writing Excel files than with xlsReadWrite. XLConnect looks good but - as the xlsReadWrite author and planing to release a xlsx/64 bit successor - I'd be interested to learn what you mean with better luck reading/writing. Thanks a lot and Cheers, Hans-Peter -- Jim Holtman Data Munger Guru What is the problem that you are trying to solve? Tell me what you want to do, not how you want to do it. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] ARMAtoAR function
Good morning. I wonder whether R has a function to convert an ARMA process to an infinite AR process, like the ARMAtoMA function, which converts an ARMA process to an infinite MA process. -- Diego Fernando Lemus Polanía Ingeniero Industrial Universidad Nacional de Colombia Sede Medellín [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
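There is no ARMAtoAR in base R, but (an observation not from this digest) the AR(inf) pi-weights can be obtained from ARMAtoMA itself by swapping and negating the two coefficient sets, because the pi-weight generating polynomial phi(B)/theta(B) is the reciprocal of the psi-weight polynomial theta(B)/phi(B):

```r
# pi-weights (infinite AR representation) of an ARMA(p, q) model,
# using only stats::ARMAtoMA; sign conventions follow arima()/ARMAtoMA
ARMAtoAR <- function(ar = numeric(), ma = numeric(), lag.max) {
  -ARMAtoMA(ar = -ma, ma = -ar, lag.max = lag.max)
}

# Check on an ARMA(1,1) with ar = 0.5, ma = 0.3:
# phi(B)/theta(B) = (1 - 0.5B)/(1 + 0.3B) = 1 - 0.8B + 0.24B^2 - 0.072B^3 ...
# so pi_1 = 0.8, pi_2 = -0.24, pi_3 = 0.072
ARMAtoAR(ar = 0.5, ma = 0.3, lag.max = 3)
```

For a pure AR model the function simply returns the ar coefficients padded with zeros, which is a quick sanity check.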
Re: [R] Problem creating reference manuals from latex
Duncan, Thank you for your patience, time and expertise. You were 100% correct and the problem has been resolved. I'm adding what I did as a windows user to complete the list record for future searchers. To download the inconsolata package (you may approach this several ways this one seemed easiest ot me) Go to the command prompt and type: mpm --verbose --install inconsolata Thanks again Duncan! I appreciate it. Tyler Date: Tue, 15 Nov 2011 06:15:05 -0500 From: murdoch.dun...@gmail.com To: tyler_rin...@hotmail.com CC: r-help@r-project.org Subject: Re: [R] Problem creating reference manuals from latex On 11-11-14 10:25 PM, Tyler Rinker wrote: Duncan, Thank you for your reply. I was not clear about the Internet access. I do have access, just at times I don't, hence the need to produce the manuals from latex rather than simply using the Internet. Please pardon my lack of knowledge around your response. You said I'd have to install the inconsolata.sty from CTAN. How? Is this installed in an R directory or a Tex directory? Do I use R to install it or latex or save the file and drop into a particular folder (directory)? It is a TeX package. You need to use the MikTeX package installer to install it. I usually set up MikTeX to do this automatically when it needs a new package, but that requires an available Internet connection; you'll need to do something manually. Start in the Start Menu item for MikTeX 2.9, and find the package manager item. Run it, and choose to install the inconsolata package. Duncan Murdoch I've used rseek and a simple google search which reveals a great deal about inconsolata, unfortunately I am not grasping what I need to do. 
Tyler Date: Mon, 14 Nov 2011 21:59:10 -0500 From: murdoch.dun...@gmail.com To: tyler_rin...@hotmail.com CC: r-help@r-project.org Subject: Re: [R] Problem creating reference manuals from latex On 11-11-14 9:44 PM, Tyler Rinker wrote: R Community, I often am in need of viewing the reference manuals of packages and do not have Internet access. I have used the code: path- find.package('tm') system(paste(shQuote(file.path(R.home(bin), R)),CMD, Rd2pdf,shQuote(path))) someone kindly provided from this help list to generate the manuals from the latex files. This worked well with version R 2.13. After the upgrade to R 2.14 I use this code (see below and get an error message I don't understand). I'm pretty sure ! LaTeX Error: File `inconsolata.sty' not found. is important but don't get it's significance. There's a post about it here: http://r.789695.n4.nabble.com/inconsolata-font-for-building-vignettes-with-R-devel-td3838176.html but I am a windows user making this a moot point. I know this file is n R font's file that Miktext needs to build the manual. I'd like to be able generate the reference manuals again without the Internet. While the code above worked in the past I'm open to alternative methods. You need to install the inconsolata.sty file. It is available on CTAN (the TeX network, not the R one). You say you don't have Internet access, so I don't know how you'll do this, but presumably there's a way: you got MikTex installed somehow. Duncan Murdoch Version: R 2.14.0 2011-10-31 OS: Windows 7 Latex: MikTex 2.9 Thank you Tyler Rinker path- find.package('tm') system(paste(shQuote(file.path(R.home(bin), R)),CMD, Rd2pdf,shQuote(path))) Hmm ... looks like a package Converting parsed Rd's to LaTeX ... Creating pdf output from LaTeX ... 
Warning: running command 'C:\PROGRA~2\MIKTEX~1.9\miktex\bin\texi2dvi.exe --pdf Rd2.tex -I C:/PROGRA~1/R/R-214~1.0/share/texmf/tex/latex -I C:/PROGRA~1/R/R-214~1.0/share/texmf/bibtex/bst' had status 1 Error : running 'texi2dvi' on 'Rd2.tex' failed LaTeX errors: ! LaTeX Error: File `inconsolata.sty' not found. Type X to quit orRETURN to proceed, or enter new name. (Default extension: sty) ! Emergency stop. read * l.267 ! == Fatal error occurred, no output PDF file produced! Error in running tools::texi2dvi Warning message: running command 'C:/PROGRA~1/R/R-214~1.0/bin/i386/R CMD Rd2pdf C:/Users/Rinker/R/win-library/2.14/tm' had status 1 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal,
Re: [R] break error bars in ggplot2
Dear Ben, great, works fine! I guess the error occurs because data outside the scale limits is thrown away, as stated in ?coord_cartesian. Thanks, Felix From: Ben Bolker bbolker at gmail.com Date: Tue, 15 Nov 2011 13:44:44 +0000 Fischer, Felix Felix.Fischer at charite.de writes: Hello, I use ggplot to plot some measures including CIs as horizontal errorbars. I get an error when the scale limits are narrower than the boundaries of the error bar, and hence the CIs are not plotted.

library(ggplot2)
df <- data.frame(resp=c(1,2), k=c(1,2), se=c(1,2))
ggplot(df, aes(resp, y=k)) + geom_point() +
  geom_errorbarh(aes(xmax = resp + se, xmin = resp - se)) +
  scale_x_continuous(limits=c(-1,3))

Is there a way to plot the errorbars anyway? Setting xmax to the scale limit is not so good, I guess, because you couldn't determine whether the CI is wider than the scale limits or not. I'm not sure I completely understand your last paragraph, but I think you want to substitute coord_cartesian(xlim=c(-1,3)) for your scale_x_continuous() component; as discussed in the ggplot2 book, limits set on scales act differently than limits set on coordinate systems. (I'm a little surprised you get an error, though.) There's a very active ggplot2 Google group that might be best for ggplot(2)-specific questions ... [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Models with ordered and unordered factors
Hello; I am having a problem with the interpretation of models using ordered or unordered predictors. I am running models in lmer, but I will try to give a simplified example data set using lm. Both in the example and in my real data set I use a predictor variable referring to 3 consecutive days of an experiment. It is a factor, and I thought it would be more correct to consider it ordered. Below is my example code with my comments/ideas along it. Can someone help me to understand what is happening? Thanks a lot in advance; Catarina Miranda

y <- c(72,25,24,2,18,38,62,30,78,34,67,21,97,79,64,53,27,81)
Day <- c(rep("Day 1",6), rep("Day 2",6), rep("Day 3",6))
dataf <- data.frame(y, Day)
str(dataf) # Day is not ordered
#'data.frame': 18 obs. of 2 variables:
# $ y  : num 72 25 24 2 18 38 62 30 78 34 ...
# $ Day: Factor w/ 3 levels "Day 1","Day 2",..: 1 1 1 1 1 1 2 2 2 2 ...
summary(lm(y~Day, data=dataf))
# Day 2 is not significantly different from Day 1, but Day 3 is.
#
#Call:
#lm(formula = y ~ Day, data = dataf)
#
#Residuals:
#    Min      1Q  Median      3Q     Max
#-39.833 -14.458  -3.833  13.958  42.167
#
#Coefficients:
#            Estimate Std. Error t value Pr(>|t|)
#(Intercept)   29.833      9.755   3.058  0.00797 **
#DayDay 2      18.833     13.796   1.365  0.19234
#DayDay 3      37.000     13.796   2.682  0.01707 *
#---
#Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#
#Residual standard error: 23.9 on 15 degrees of freedom
#Multiple R-squared: 0.3241, Adjusted R-squared: 0.234
#F-statistic: 3.597 on 2 and 15 DF, p-value: 0.05297

dataf$Day <- ordered(dataf$Day)
str(dataf) # now ordered: Day 1 < Day 2 < Day 3
#'data.frame': 18 obs. of 2 variables:
# $ y  : num 72 25 24 2 18 38 62 30 78 34 ...
# $ Day: Ord.factor w/ 3 levels "Day 1"<"Day 2"<..: 1 1 1 1 1 1 2 2 2 2 ...
summary(lm(y~Day, data=dataf))
# Significances reversed (or are Day.L and Day.Q not synonymous with Day 2
# and Day 3?): Day 2 (.L) is significantly different from Day 1, but
# Day 3 (.Q) isn't.
#
#Call:
#lm(formula = y ~ Day, data = dataf)
#
#Residuals:
#    Min      1Q  Median      3Q     Max
#-39.833 -14.458  -3.833  13.958  42.167
#
#Coefficients:
#            Estimate Std. Error t value Pr(>|t|)
#(Intercept)  48.4444     5.6322   8.601 3.49e-07 ***
#Day.L        26.1630     9.7553   2.682   0.0171 *
#Day.Q        -0.2722     9.7553  -0.028   0.9781
#---
#Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#
#Residual standard error: 23.9 on 15 degrees of freedom
#Multiple R-squared: 0.3241, Adjusted R-squared: 0.234
#F-statistic: 3.597 on 2 and 15 DF, p-value: 0.05297

[[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] how to indice the column in the data.frame
hi R users, Now I read data from a txt file:

newdata <- read.table("1.txt")

In 1.txt there are several columns, shown as below:
1 3 4 5
2 3 5 6
4 5 6 7

So when I want to analyse the second column:

anadata <- newdata$V2

But my question: how can I use some variable to index the column? e.g.

cmn <- 2
anadata <- newdata$Vcmn   # this does not work

How can I finish this command? Can anyone help me? Thank you. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Controlling the precision of the digits printed
Has anyone come across the right combination of options to print a limited number of digits? My trial and error approach is taking too much time. Here is what I have tried:

op <- options()
a <- c(1e-10, 1, 2, 3, .5, .25)
names(a) <- c("A", "B", "C", "D", "E", "F")
# default
a
      A       B       C       D       E       F
1.0e-10 1.0e+00 2.0e+00 3.0e+00 5.0e-01 2.5e-01
options(digits = 4, scipen = 5)
# Doesn't print exponents, but there are too many trailing digits
a
   A    B    C    D    E    F
0.01 1.00 2.00 3.00 0.50 0.25
options(digits = 3, scipen = 4)
# Now we are back to exponents
a
      A       B       C       D       E       F
1.0e-10 1.0e+00 2.0e+00 3.0e+00 5.0e-01 2.5e-01

I would like the integers to print as integers (1, 2, 3), the larger fractions to print something like .5000 or .2500, and the very small number to use exponents (1.0e-10). Is this possible? Thank you. Kevin [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to indice the column in the data.frame
Hi, Look at ?"[":

anadata <- newdata[, cmn]
## i.e., extract all rows (the first argument is empty) of the 2nd column
anadata <- newdata[, 2]

Hope this helps, Josh On Tue, Nov 15, 2011 at 8:02 AM, haohao Tsing haohaor...@gmail.com wrote: hi R users, Now I read data from a txt file: newdata <- read.table("1.txt") In 1.txt there are several columns, shown as below: 1 3 4 5 2 3 5 6 4 5 6 7 So when I want to analyse the second column: anadata <- newdata$V2 But my question: how can I use some variable to index the column? e.g. cmn <- 2; anadata <- newdata$Vcmn How can I finish this command? Can anyone help me? Thank you. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joshua Wiley Ph.D. Student, Health Psychology Programmer Analyst II, ATS Statistical Consulting Group University of California, Los Angeles https://joshuawiley.com/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to indice the column in the data.frame
On Tue, Nov 15, 2011 at 11:23 AM, Joshua Wiley jwiley.ps...@gmail.com wrote: Hi, Look at ?"[" anadata <- newdata[, cmn] ## i.e., extract all rows (the first argument is empty) of the 2nd column anadata <- newdata[, 2] Or, if this is part of a more general problem and the column names are not necessarily in sequence:

newdata <- data.frame(V1=1:3, V2=4:6, V3=7:9)
newdata[, paste("V", 2, sep="")]
[1] 4 5 6

Sarah Hope this helps, Josh On Tue, Nov 15, 2011 at 8:02 AM, haohao Tsing haohaor...@gmail.com wrote: hi R users, Now I read data from a txt file: newdata <- read.table("1.txt") In 1.txt there are several columns, shown as below: 1 3 4 5 2 3 5 6 4 5 6 7 So when I want to analyse the second column: anadata <- newdata$V2 But my question: how can I use some variable to index the column? e.g. cmn <- 2; anadata <- newdata$Vcmn How can I finish this command? Can anyone help me? Thank you. -- Sarah Goslee http://www.functionaldiversity.org __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
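A closely related idiom combining both answers: build the name with paste() and extract with [[, which returns the column as a plain vector just like $ does:

```r
newdata <- data.frame(V1 = 1:3, V2 = 4:6, V3 = 7:9)

cmn <- 2
newdata[[paste("V", cmn, sep = "")]]   # same result as newdata$V2
# [1] 4 5 6
```

Unlike newdata[, cmn], this works by name rather than position, so it is robust if the columns are later reordered.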
Re: [R] Controlling the precision of the digits printed
Hi Kevin, I am not sure you will find anything other than manual tweaking, that will vary between no decimals for integers, some for small fractions, and scientific for very small. You can also look at: ?round ?format. If this is for code/a report, you could make any formatting you wanted with enough effort and those. Best regards, Josh On Tue, Nov 15, 2011 at 8:18 AM, Kevin Burton rkevinbur...@charter.net wrote: Has anyone come across the right combinations to print a limited number of digits? My trial and error approach is taking too much time. Here is what I have tried: op - options() a - c(1e-10,1,2,3,.5,.25) names(a) - c(A, B, C, D, E, F) # default a A B C D E F 1.0e-10 1.0e+00 2.0e+00 3.0e+00 5.0e-01 2.5e-01 options(digits = 4, scipen=5) # Doesn't print exponents but there are too many trailing digits a A B C D E F 0.01 1.00 2.00 3.00 0.50 0.25 options(digits = 3, scipen=4) # Now we are back to exponents a A B C D E F 1.0e-10 1.0e+00 2.0e+00 3.0e+00 5.0e-01 2.5e-01 I would like the integers to print as integers (1,2,3). The larger fractions to print something like .5000 or .2500. And the very small number to use exponents (1.0e-10) Is this possible? Thank you. Kevin [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joshua Wiley Ph.D. Student, Health Psychology Programmer Analyst II, ATS Statistical Consulting Group University of California, Los Angeles https://joshuawiley.com/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
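A manual per-element formatter along the lines Josh suggests, using format() and formatC() (the thresholds and digit counts are arbitrary choices, not from the thread):

```r
a <- c(1e-10, 1, 2, 3, .5, .25)

fmt1 <- function(x) {
  if (x != 0 && abs(x) < 1e-4) {
    format(x, scientific = TRUE)          # very small: exponent form
  } else if (x == round(x)) {
    format(x, scientific = FALSE)         # whole numbers: no decimals
  } else {
    formatC(x, format = "f", digits = 4)  # fractions: fixed 4 decimals
  }
}

vapply(a, fmt1, character(1))
# "1e-10" "1" "2" "3" "0.5000" "0.2500"
```

Note that vapply() is used instead of a vectorised ifelse(), because format() applied to the whole vector at once would pad every element to a common width.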
Re: [R] Models with ordered and unordered factors
Ordered factors use orthogonal polynomial contrasts by default; the .L and .Q stand for the linear and quadratic terms. Unordered factors use treatment contrasts (although they're actually not contrasts), which are interpreted as you described. If you do not know what this means, you need to do some reading on linear models/multiple regression. Try posting on http://stats.stackexchange.com/ or, as always, consult your local statistician for help. V&R's MASS book also contains a useful but terse discussion of these issues. Cheers, Bert On Tue, Nov 15, 2011 at 7:00 AM, Catarina Miranda catarina.mira...@gmail.com wrote: Hello; I am having problems with the interpretation of models using ordered or unordered predictors. I am running models in lmer but I will try to give a simplified example data set using lm. Both in the example and in my real data set I use a predictor variable referring to 3 consecutive days of an experiment. It is a factor, and I thought it would be more correct to consider it ordered. Below is my example code with my comments/ideas along it. Can someone help me to understand what is happening? Thanks a lot in advance; Catarina Miranda y <- c(72,25,24,2,18,38,62,30,78,34,67,21,97,79,64,53,27,81) Day <- c(rep("Day 1",6), rep("Day 2",6), rep("Day 3",6)) dataf <- data.frame(y, Day) str(dataf) #Day is not ordered #'data.frame': 18 obs. of 2 variables: # $ y : num 72 25 24 2 18 38 62 30 78 34 ... # $ Day: Factor w/ 3 levels "Day 1","Day 2",..: 1 1 1 1 1 1 2 2 2 2 ... summary(lm(y~Day, data=dataf)) #Day 2 is not significantly different from Day 1, but Day 3 is. # #Call: #lm(formula = y ~ Day, data = dataf) # #Residuals: #Min 1Q Median 3Q Max #-39.833 -14.458 -3.833 13.958 42.167 # #Coefficients: #Estimate Std. Error t value Pr(>|t|) #(Intercept) 29.833 9.755 3.058 0.00797 ** #DayDay 2 18.833 13.796 1.365 0.19234 #DayDay 3 37.000 13.796 2.682 0.01707 * #--- #Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 # #Residual standard error: 23.9 on 15 degrees of freedom #Multiple R-squared: 0.3241, Adjusted R-squared: 0.234 #F-statistic: 3.597 on 2 and 15 DF, p-value: 0.05297 # dataf$Day <- ordered(dataf$Day) str(dataf) #'data.frame': 18 obs. of 2 variables: # $ y : num 72 25 24 2 18 38 62 30 78 34 ... # $ Day: Ord.factor w/ 3 levels "Day 1"<"Day 2"<..: 1 1 1 1 1 1 2 2 2 2 ... summary(lm(y~Day, data=dataf)) #Significances reversed (or are Day.L and Day.Q not synonymous with Day 2 and Day 3?): Day 2 (.L) is significantly different from Day 1, but Day 3 (.Q) isn't. #Call: #lm(formula = y ~ Day, data = dataf) # #Residuals: #Min 1Q Median 3Q Max #-39.833 -14.458 -3.833 13.958 42.167 # #Coefficients: #Estimate Std. Error t value Pr(>|t|) #(Intercept) 48. 5.6322 8.601 3.49e-07 *** #Day.L 26.1630 9.7553 2.682 0.0171 * #Day.Q -0.2722 9.7553 -0.028 0.9781 #--- #Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 # #Residual standard error: 23.9 on 15 degrees of freedom #Multiple R-squared: 0.3241, Adjusted R-squared: 0.234 #F-statistic: 3.597 on 2 and 15 DF, p-value: 0.05297 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Bert Gunter Genentech Nonclinical Biostatistics Internal Contact Info: Phone: 467-7374 Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to indice the column in the data.frame
On Nov 15, 2011, at 11:02 AM, haohao Tsing wrote: hi R users, I read data from a txt file: newdata <- read.table("1.txt") In 1.txt there are several columns, shown below: 1 3 4 5 2 3 5 6 4 5 6 7 When I want to analyse the second column I use anadata <- newdata$V2, but why can I not use a variable to index the column? e.g. cmn <- 2; anadata <- newdata$Vcmn Either: anadata <- newdata[[ paste("V", cmn, sep="") ]] Or: anadata <- newdata[[cmn]] How can I finish this command? Can anyone help me? Thank you. You should go back and study your introductory manual now. If this is not near the beginning of that manual, you should switch manuals. You should also type: ?"$" and read that page carefully and try all of the examples. It will probably take more than one reading to master its contents, but it is a core part of learning R. -- David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Models with ordered and unordered factors
... In addition, the following may also be informative. f <- paste("day", 1:3) contrasts(ordered(f)) .L .Q [1,] -7.071068e-01 0.4082483 [2,] -7.850462e-17 -0.8164966 [3,] 7.071068e-01 0.4082483 contrasts(factor(f)) day 2 day 3 day 1 0 0 day 2 1 0 day 3 0 1 Cheers, Bert On Tue, Nov 15, 2011 at 8:32 AM, Bert Gunter bgun...@gene.com wrote: Ordered factors use orthogonal polynomial contrasts by default; the .L and .Q stand for the linear and quadratic terms. Unordered factors use treatment contrasts (although they're actually not contrasts), which are interpreted as you described. If you do not know what this means, you need to do some reading on linear models/multiple regression. Try posting on http://stats.stackexchange.com/ or, as always, consult your local statistician for help. V&R's MASS book also contains a useful but terse discussion of these issues. Cheers, Bert On Tue, Nov 15, 2011 at 7:00 AM, Catarina Miranda catarina.mira...@gmail.com wrote: Hello; I am having problems with the interpretation of models using ordered or unordered predictors. I am running models in lmer but I will try to give a simplified example data set using lm. Both in the example and in my real data set I use a predictor variable referring to 3 consecutive days of an experiment. It is a factor, and I thought it would be more correct to consider it ordered. Below is my example code with my comments/ideas along it. Can someone help me to understand what is happening? Thanks a lot in advance; Catarina Miranda y <- c(72,25,24,2,18,38,62,30,78,34,67,21,97,79,64,53,27,81) Day <- c(rep("Day 1",6), rep("Day 2",6), rep("Day 3",6)) dataf <- data.frame(y, Day) str(dataf) #Day is not ordered #'data.frame': 18 obs. of 2 variables: # $ y : num 72 25 24 2 18 38 62 30 78 34 ... # $ Day: Factor w/ 3 levels "Day 1","Day 2",..: 1 1 1 1 1 1 2 2 2 2 ... summary(lm(y~Day, data=dataf)) #Day 2 is not significantly different from Day 1, but Day 3 is. 
# #Call: #lm(formula = y ~ Day, data = dataf) # #Residuals: #Min 1Q Median 3Q Max #-39.833 -14.458 -3.833 13.958 42.167 # #Coefficients: #Estimate Std. Error t value Pr(>|t|) #(Intercept) 29.833 9.755 3.058 0.00797 ** #DayDay 2 18.833 13.796 1.365 0.19234 #DayDay 3 37.000 13.796 2.682 0.01707 * #--- #Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 # #Residual standard error: 23.9 on 15 degrees of freedom #Multiple R-squared: 0.3241, Adjusted R-squared: 0.234 #F-statistic: 3.597 on 2 and 15 DF, p-value: 0.05297 # dataf$Day <- ordered(dataf$Day) str(dataf) #'data.frame': 18 obs. of 2 variables: # $ y : num 72 25 24 2 18 38 62 30 78 34 ... # $ Day: Ord.factor w/ 3 levels "Day 1"<"Day 2"<..: 1 1 1 1 1 1 2 2 2 2 ... summary(lm(y~Day, data=dataf)) #Significances reversed (or are Day.L and Day.Q not synonymous with Day 2 and Day 3?): Day 2 (.L) is significantly different from Day 1, but Day 3 (.Q) isn't. #Call: #lm(formula = y ~ Day, data = dataf) # #Residuals: #Min 1Q Median 3Q Max #-39.833 -14.458 -3.833 13.958 42.167 # #Coefficients: #Estimate Std. Error t value Pr(>|t|) #(Intercept) 48. 5.6322 8.601 3.49e-07 *** #Day.L 26.1630 9.7553 2.682 0.0171 * #Day.Q -0.2722 9.7553 -0.028 0.9781 #--- #Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 # #Residual standard error: 23.9 on 15 degrees of freedom #Multiple R-squared: 0.3241, Adjusted R-squared: 0.234 #F-statistic: 3.597 on 2 and 15 DF, p-value: 0.05297 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
-- Bert Gunter Genentech Nonclinical Biostatistics Internal Contact Info: Phone: 467-7374 Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Controlling the precision of the digits printed
When you print a vector R uses a single format for the whole vector and tries to come up with one format that displays all the values accurately enough. For a matrix (or data.frame) it uses a different format for each column, so perhaps you would like the output of: matrix(a, nrow=1, dimnames=list("", names(a))) A B C D E F 1e-10 1 2 3 0.5 0.25 Now, you said you wanted a minimum of 4 digits after the decimal point for large fractions like 0.25 but only 2 when using scientific notation for small fractions like 1.0e-10, and you didn't say what you wanted for big numbers like pi*10^10. That rule seems complicated enough that you may want to write your own print function based on sprintf(). Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com -----Original Message----- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Kevin Burton Sent: Tuesday, November 15, 2011 8:19 AM To: r-help@r-project.org Subject: [R] Controlling the precision of the digits printed Has anyone come across the right combination to print a limited number of digits? My trial-and-error approach is taking too much time. Here is what I have tried: op <- options() a <- c(1e-10, 1, 2, 3, .5, .25) names(a) <- c("A", "B", "C", "D", "E", "F") # default a A B C D E F 1.0e-10 1.0e+00 2.0e+00 3.0e+00 5.0e-01 2.5e-01 options(digits = 4, scipen = 5) # Doesn't print exponents but there are too many trailing digits a A B C D E F 0.01 1.00 2.00 3.00 0.50 0.25 options(digits = 3, scipen = 4) # Now we are back to exponents a A B C D E F 1.0e-10 1.0e+00 2.0e+00 3.0e+00 5.0e-01 2.5e-01 I would like the integers to print as integers (1, 2, 3). The larger fractions to print something like .5000 or .2500. And the very small number to use exponents (1.0e-10) Is this possible? Thank you. 
Kevin [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
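Following Bill's suggestion of a custom print function built on sprintf(), a minimal sketch might look like the following. The cutoff (1e-4), the branch order, and the function name fmt are illustrative assumptions, not part of the original exchange: tiny values get scientific notation, whole numbers print bare, and everything else gets four decimals.

```r
# Sketch of a sprintf()-based formatter: tiny nonzero values in scientific
# notation, whole numbers as bare integers, other values with 4 decimals.
fmt <- function(x, tiny = 1e-4) {
  ifelse(x != 0 & abs(x) < tiny, sprintf("%.1e", x),
         ifelse(x == round(x), sprintf("%d", as.integer(x)),
                sprintf("%.4f", x)))
}

a <- c(1e-10, 1, 2, 3, .5, .25)
fmt(a)
# "1.0e-10" "1" "2" "3" "0.5000" "0.2500"
```

Because the result is a character vector, it prints uniformly regardless of the digits/scipen options.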
Re: [R] With an example - Re: rbind.data.frame drops attributes for factor variables
Thanks. Yes, I meant nrow(dataset)+1 (typo...) Sammy On Mon, Nov 14, 2011 at 1:29 AM, Petr PIKAL petr.pi...@precheza.cz wrote: dataset[ nrow(dataset), ] <- c("Male", 5, "bad") The above seems to have worked to append a row in place of an rbind(). This No. It overwrites your last row. You maybe meant dataset[ nrow(dataset)+1, ] <- c("Male", 5, "bad") Regards Petr method does not drop the custom attributes from the column. Do you see any issue with this method? Thanks, Sammy On Sat, Nov 12, 2011 at 10:16 PM, David Winsemius dwinsem...@comcast.net wrote: On Nov 12, 2011, at 6:40 PM, Sammy Zee wrote: Thanks David. Besides rbind(), is there any other way to add a row to a data frame so that I do not lose the custom attributes? I have already told you the method that I know of. You don't seem to have taken my point that it is not a data.frame-specific problem but rather a factor problem. You are welcome to redefine `rbind.data.frame`. The R language is rather flexible in that manner. -- David. Thanks, Sammy On Sat, Nov 12, 2011 at 5:17 PM, David Winsemius dwinsem...@comcast.net wrote: On Nov 12, 2011, at 2:47 PM, Sammy Zee wrote: When I use rbind() or rbind.data.frame() to add a row to an existing dataframe, it appears that attributes for the column of type factor are dropped. See the sample example below to reproduce the problem. Please suggest how I can fix this. 
Thanks, Sammy a <- c("Male", "Male", "Female", "Male") b <- c(1,2,3,4) c <- c("great", "bad", "good", "bad") dataset <- data.frame(gender = a, count = b, answer = c) dataset gender count answer 1 Male 1 great 2 Male 2 bad 3 Female 3 good 4 Male 4 bad attributes(dataset$answer) $levels [1] "bad" "good" "great" $class [1] "factor" Now adding some custom attributes to column dataset$answer: attributes(dataset$answer) <- c(attributes(dataset$answer), list(newattr1="custom-attr1")) attributes(dataset$answer) <- c(attributes(dataset$answer), list(newattr2="custom-attr2")) If you look through the code of rbind.data.frame you see that column values are processed with the 'factor' function. attributes(dataset$answer) $levels [1] "bad" "good" "great" $class [1] "factor" $newattr1 [1] "custom-attr1" $newattr2 [1] "custom-attr2" attributes(factor(dataset$answer)) $levels [1] "bad" "good" "great" $class [1] "factor" So I think you are out of luck. You will need to restore the special attributes yourself. -- David. attributes(dataset$answer) $levels [1] "bad" "good" "great" $class [1] "factor" $newattr1 [1] "custom-attr1" $newattr2 [1] "custom-attr2" However, as soon as I add a row to this data frame (dataset) by rbind(), it loses the custom attributes (newattr1 and newattr2) I have just added: newrow = c(gender="Female", count = 5, answer = "great") dataset <- rbind(dataset, newrow) attributes(dataset$answer) $levels [1] "bad" "good" "great" $class [1] "factor" The two custom attributes are dropped!! Any suggestion why this is happening? On Fri, Nov 11, 2011 at 11:44 AM, Jeff Newmiller jdnew...@dcn.davis.ca.us wrote: As the doctor says, if it hurts don't do that. A factor is a sequence of integers with a corresponding list of character strings. Factors in two separate vectors can and usually do map the same integer to different strings, and R cannot tell how you want that resolved. 
Convert these columns to character before combining them, and only convert to factor when you have all of your possibilities present (or you specify them in the creation of the factor vector). -- Jeff Newmiller Sammy Zee szee2...@gmail.com wrote: Hi all, When I use rbind() or rbind.data.frame() to add a row to an existing dataframe, it appears that attributes for the column of type factor are dropped. I see the following post with the same problem; however, I did not see any reply to it offering a solution. Could someone please help? http://r.789695.n4.nabble.com/rbind-data-frame-drops-attributes-for-factor-variables-td919575.html Thanks, Sammy [[alternative HTML version deleted]] -- David Winsemius, MD West Hartford, CT [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
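Putting together Jeff's advice (convert to character, combine, then re-create the factor) with David's point that the custom attributes must be restored by hand, a sketch of the workflow might look like this. The column names and attribute value follow the thread's example; the exact sequence of steps is an assumption, not code from the thread.

```r
# Small version of the thread's data frame, with one custom attribute.
dataset <- data.frame(gender = c("Male", "Female"),
                      count  = 1:2,
                      answer = factor(c("great", "bad")),
                      stringsAsFactors = FALSE)
attr(dataset$answer, "newattr1") <- "custom-attr1"

saved <- attributes(dataset$answer)            # stash the attributes first
dataset$answer <- as.character(dataset$answer) # sidestep rbind's factor handling
dataset <- rbind(dataset, list(gender = "Male", count = 3, answer = "good"))
dataset$answer <- factor(dataset$answer)       # re-create the factor at the end
attr(dataset$answer, "newattr1") <- saved$newattr1  # restore the custom attribute

attr(dataset$answer, "newattr1")
# "custom-attr1"
```

The key point is that the factor is only created once, after all its possible values are present, so the levels are complete and the restored attributes describe the final column.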
Re: [R] Models with ordered and unordered factors
On Tue, Nov 15, 2011 at 9:00 AM, Catarina Miranda catarina.mira...@gmail.com wrote: Hello; I am having problems with the interpretation of models using ordered or unordered predictors. I am running models in lmer but I will try to give a simplified example data set using lm. Both in the example and in my real data set I use a predictor variable referring to 3 consecutive days of an experiment. It is a factor, and I thought it would be more correct to consider it ordered. Below is my example code with my comments/ideas along it. Can someone help me to understand what is happening? Dear Catarina: I have had the same question, and I hope my answers help you understand what's going on. The short version: http://pj.freefaculty.org/R/WorkingExamples/orderedFactor-01.R The longer version, Working with Ordinal Predictors: http://pj.freefaculty.org/ResearchPapers/MidWest09/Midwest09.pdf HTH pj Thanks a lot in advance; Catarina Miranda y <- c(72,25,24,2,18,38,62,30,78,34,67,21,97,79,64,53,27,81) Day <- c(rep("Day 1",6), rep("Day 2",6), rep("Day 3",6)) dataf <- data.frame(y, Day) str(dataf) #Day is not ordered #'data.frame': 18 obs. of 2 variables: # $ y : num 72 25 24 2 18 38 62 30 78 34 ... # $ Day: Factor w/ 3 levels "Day 1","Day 2",..: 1 1 1 1 1 1 2 2 2 2 ... summary(lm(y~Day, data=dataf)) #Day 2 is not significantly different from Day 1, but Day 3 is. # #Call: #lm(formula = y ~ Day, data = dataf) # #Residuals: # Min 1Q Median 3Q Max #-39.833 -14.458 -3.833 13.958 42.167 # #Coefficients: # Estimate Std. Error t value Pr(>|t|) #(Intercept) 29.833 9.755 3.058 0.00797 ** #DayDay 2 18.833 13.796 1.365 0.19234 #DayDay 3 37.000 13.796 2.682 0.01707 * #--- #Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 # #Residual standard error: 23.9 on 15 degrees of freedom #Multiple R-squared: 0.3241, Adjusted R-squared: 0.234 #F-statistic: 3.597 on 2 and 15 DF, p-value: 0.05297 # dataf$Day <- ordered(dataf$Day) str(dataf) #'data.frame': 18 obs. of 2 variables: # $ y : num 72 25 24 2 18 38 62 30 78 34 ... # $ Day: Ord.factor w/ 3 levels "Day 1"<"Day 2"<..: 1 1 1 1 1 1 2 2 2 2 ... summary(lm(y~Day, data=dataf)) #Significances reversed (or are Day.L and Day.Q not synonymous with Day 2 and Day 3?): Day 2 (.L) is significantly different from Day 1, but Day 3 (.Q) isn't. #Call: #lm(formula = y ~ Day, data = dataf) # #Residuals: # Min 1Q Median 3Q Max #-39.833 -14.458 -3.833 13.958 42.167 # #Coefficients: # Estimate Std. Error t value Pr(>|t|) #(Intercept) 48. 5.6322 8.601 3.49e-07 *** #Day.L 26.1630 9.7553 2.682 0.0171 * #Day.Q -0.2722 9.7553 -0.028 0.9781 #--- #Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 # #Residual standard error: 23.9 on 15 degrees of freedom #Multiple R-squared: 0.3241, Adjusted R-squared: 0.234 #F-statistic: 3.597 on 2 and 15 DF, p-value: 0.05297 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Paul E. Johnson Professor, Political Science 1541 Lilac Lane, Room 504 University of Kansas __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Controlling the precision of the digits printed
Thank you. I mainly didn't know about the vector/matrix printing rules. Kevin -----Original Message----- From: William Dunlap [mailto:wdun...@tibco.com] Sent: Tuesday, November 15, 2011 10:43 AM To: Kevin Burton; r-help@r-project.org Subject: RE: [R] Controlling the precision of the digits printed When you print a vector R uses a single format for the whole vector and tries to come up with one format that displays all the values accurately enough. For a matrix (or data.frame) it uses a different format for each column, so perhaps you would like the output of: matrix(a, nrow=1, dimnames=list("", names(a))) A B C D E F 1e-10 1 2 3 0.5 0.25 Now, you said you wanted a minimum of 4 digits after the decimal point for large fractions like 0.25 but only 2 when using scientific notation for small fractions like 1.0e-10, and you didn't say what you wanted for big numbers like pi*10^10. That rule seems complicated enough that you may want to write your own print function based on sprintf(). Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com -----Original Message----- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Kevin Burton Sent: Tuesday, November 15, 2011 8:19 AM To: r-help@r-project.org Subject: [R] Controlling the precision of the digits printed Has anyone come across the right combination to print a limited number of digits? My trial-and-error approach is taking too much time. Here is what I have tried: op <- options() a <- c(1e-10, 1, 2, 3, .5, .25) names(a) <- c("A", "B", "C", "D", "E", "F") # default a A B C D E F 1.0e-10 1.0e+00 2.0e+00 3.0e+00 5.0e-01 2.5e-01 options(digits = 4, scipen = 5) # Doesn't print exponents but there are too many trailing digits a A B C D E F 0.01 1.00 2.00 3.00 0.50 0.25 options(digits = 3, scipen = 4) # Now we are back to exponents a A B C D E F 1.0e-10 1.0e+00 2.0e+00 3.0e+00 5.0e-01 2.5e-01 I would like the integers to print as integers (1,2,3). The larger fractions to print something like .5000 or .2500. 
And the very small number to use exponents (1.0e-10) Is this possible? Thank you. Kevin [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Regular expressions in R
Good afternoon list, I have the following character strings; one with spaces between the maths operators and variable names, and one without said spaces: form <- c('~ Sentence + LEGAL + Intro + Intro / Intro1 + Intro * LEGAL + benefit + benefit / benefit1 + product + action * mean + CTA + help + mean * product') form <- c('~Sentence+LEGAL+Intro+Intro/Intro1+Intro*LEGAL+benefit+benefit/benefit1+product+action*mean+CTA+help+mean*product') I would like to remove the following target strings, either: 1. '+ Intro * LEGAL', which is '+ space name space * space name' 2. '+Intro*LEGAL', which is '+ nospace name nospace * nospace name' Having delved into a variety of sites (e.g. http://www.zytrax.com/tech/web/regex.htm#search) investigating regular expressions I now have a basic grasp, but I am having difficulties removing ALL of the instances of 1. or 2. The code below removes just a SINGLE instance of the target string, but I was expecting it to remove all instances as I have \\*.[[:alnum:]]. I did try \\*.[[:alnum:]]*, but this did not work. form <- sub("\\+*\\s*[[:alnum:]]*\\s*\\*.[[:alnum:]]", "", form) I am obviously still not understanding something. If the list could offer some guidance I would be most grateful. 
Regards Mike Griffiths -- Michael Griffiths, Ph.D Statistician Upstream Systems 8th Floor Portland House Bressenden Place SW1E 5BH Tel +44 (0) 20 7869 5147 Fax +44 207 290 1321 Mob +44 789 4944 145 www.upstreamsystems.com griffi...@upstreamsystems.com einst...@upstreamsystems.com [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Regular expressions in R
Hi Michael, You need to take another look at the examples you were given, and at the help for ?sub: "The two *sub functions differ only in that 'sub' replaces only the first occurrence of a 'pattern' whereas 'gsub' replaces all occurrences. If 'replacement' contains backreferences which are not defined in 'pattern' the result is undefined (but most often the backreference is taken to be '')." Sarah On Tue, Nov 15, 2011 at 12:18 PM, Michael Griffiths griffi...@upstreamsystems.com wrote: Good afternoon list, I have the following character strings; one with spaces between the maths operators and variable names, and one without said spaces: form <- c('~ Sentence + LEGAL + Intro + Intro / Intro1 + Intro * LEGAL + benefit + benefit / benefit1 + product + action * mean + CTA + help + mean * product') form <- c('~Sentence+LEGAL+Intro+Intro/Intro1+Intro*LEGAL+benefit+benefit/benefit1+product+action*mean+CTA+help+mean*product') I would like to remove the following target strings, either: 1. '+ Intro * LEGAL', which is '+ space name space * space name' 2. '+Intro*LEGAL', which is '+ nospace name nospace * nospace name' Having delved into a variety of sites (e.g. http://www.zytrax.com/tech/web/regex.htm#search) investigating regular expressions I now have a basic grasp, but I am having difficulties removing ALL of the instances of 1. or 2. The code below removes just a SINGLE instance of the target string, but I was expecting it to remove all instances. form <- sub("\\+*\\s*[[:alnum:]]*\\s*\\*.[[:alnum:]]", "", form) I am obviously still not understanding something. If the list could offer some guidance I would be most grateful. 
Regards Mike Griffiths -- Sarah Goslee http://www.functionaldiversity.org __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] mlpowsim
I'm using Bill Browne's MLPowSim to do some sample size estimation for a multilevel model. It creates an R program to carry out the estimation using lmer in the lme4 library. When there are predictors with more than two categories one has to modify the code generated to account for the multinomial nature of the predictor. Browne makes the following warning in his documentation based on one of his examples: Since the probability of choosing a boys' school is low, we may have all zeroes in the first row of the generated multinomial variable: i.e. no boys' schools in n2 schools generated. Consequently, the whole of the third column of the design matrix for the fixed parameters, X, would then be zero. In such instances it would not be possible to estimate the parameters, and attempting to fit this model would lead to an error message in R. This was in reference to a three-category predictor with probabilities of .15, .30 and .55. Browne points out that the solution is for the associated fixed parameters to be set to zero when there is an entire column of zeroes. Since his documentation is several years old, I'm wondering if the multilevel package in R will now properly set fixed effects in such cases to zero or if the problem remains? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Regular expressions in R
Hi Michael, Your strings were long so I made a bit smaller example. Sarah made one good point (you want to be using gsub() not sub()), but when I use your code, I do not think it even works precisely for one instance. Try this on for size; you were 99% there: ## simplified cases form1 <- c('product + action * mean + CTA + help + mean * product') form2 <- c('product+action*mean+CTA+help+mean*product') ## what I believe your desired output is 'product + CTA + help' 'product+CTA+help' gsub("\\s\\+\\s[[:alnum:]]*\\s\\*\\s[[:alnum:]]*", "", form1) gsub("\\+[[:alnum:]]*\\*[[:alnum:]]*", "", form2) ## your code (using gsub() instead of sub()) gsub("\\+*\\s*[[:alnum:]]*\\s*\\*.[[:alnum:]]", "", form1) Running on r57586 Windows x64: gsub("\\s\\+\\s[[:alnum:]]*\\s\\*\\s[[:alnum:]]*", "", form1) [1] "product + CTA + help" gsub("\\+[[:alnum:]]*\\*[[:alnum:]]*", "", form2) [1] "product+CTA+help" ## your code (using gsub() instead of sub()) gsub("\\+*\\s*[[:alnum:]]*\\s*\\*.[[:alnum:]]", "", form1) [1] "product ean + CTA + help roduct" Hope this helps, Josh On Tue, Nov 15, 2011 at 9:18 AM, Michael Griffiths griffi...@upstreamsystems.com wrote: Good afternoon list, I have the following character strings; one with spaces between the maths operators and variable names, and one without said spaces: form <- c('~ Sentence + LEGAL + Intro + Intro / Intro1 + Intro * LEGAL + benefit + benefit / benefit1 + product + action * mean + CTA + help + mean * product') form <- c('~Sentence+LEGAL+Intro+Intro/Intro1+Intro*LEGAL+benefit+benefit/benefit1+product+action*mean+CTA+help+mean*product') I would like to remove the following target strings, either: 1. '+ Intro * LEGAL', which is '+ space name space * space name' 2. '+Intro*LEGAL', which is '+ nospace name nospace * nospace name' Having delved into a variety of sites (e.g. http://www.zytrax.com/tech/web/regex.htm#search) investigating regular expressions I now have a basic grasp, but I am having difficulties removing ALL of the instances of 1. or 2. 
The code below removes just a SINGLE instance of the target string, but I was expecting it to remove all instances as I have \\*.[[:alnum:]]. I did try \\*.[[:alnum:]]*, but this did not work. form <- sub("\\+*\\s*[[:alnum:]]*\\s*\\*.[[:alnum:]]", "", form) I am obviously still not understanding something. If the list could offer some guidance I would be most grateful. Regards Mike Griffiths -- Michael Griffiths, Ph.D Statistician Upstream Systems 8th Floor Portland House Bressenden Place SW1E 5BH Tel +44 (0) 20 7869 5147 Fax +44 207 290 1321 Mob +44 789 4944 145 www.upstreamsystems.com griffi...@upstreamsystems.com einst...@upstreamsystems.com [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joshua Wiley Ph.D. Student, Health Psychology Programmer Analyst II, ATS Statistical Consulting Group University of California, Los Angeles https://joshuawiley.com/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Extract pattern from string
Hello, with Sys.time() you get the following string: 2011-11-15 16:25:55 GMT How can I extract the following substrings: year - 2011 month - 11 day_time - 15_16_25_55 Cheers, Syrvn -- View this message in context: http://r.789695.n4.nabble.com/Extract-pattern-from-string-tp4073432p4073432.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Extract pattern from string
Take a look at the structure of what Sys.time() returns: str(Sys.time()) and now at ?strptime! format(Sys.time(), format='%d-%H-%M-%S') [1] "15-09-55-55" format(Sys.time(), format='%Y') [1] "2011" format(Sys.time(), format='%m') [1] "11" Hope that helps, Justin On Tue, Nov 15, 2011 at 9:48 AM, syrvn ment...@gmx.net wrote: Hello, with Sys.time() you get the following string: 2011-11-15 16:25:55 GMT How can I extract the following substrings: year - 2011 month - 11 day_time - 15_16_25_55 Cheers, Syrvn -- View this message in context: http://r.789695.n4.nabble.com/Extract-pattern-from-string-tp4073432p4073432.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
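Putting Justin's pieces together for the exact strings the original post asked for (the timestamp below is fixed rather than taken from Sys.time() so the example is reproducible):

```r
# Fixed timestamp from the original post; with live data use Sys.time().
tm <- as.POSIXct("2011-11-15 16:25:55", tz = "GMT")

format(tm, "%Y")            # year:     "2011"
format(tm, "%m")            # month:    "11"
format(tm, "%d_%H_%M_%S")   # day_time: "15_16_25_55"
```

All three come from one strptime-style format string; see ?strptime for the full list of conversion codes.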
[R] Putting directory path as a parameter
Hi List, I am new to R, so this may be simple. I want to store a directory path as a parameter which in turn is to be used while reading and writing data from csv files. How can I use dir, defined in the example below, while reading the csv file? Example: dir <- "C:/Users/Desktop" # location of file temp_data <- read.csv("dir/bs_dev_segment_file.csv") If I run this it shows errors: Error in file(file, "rt") : cannot open the connection In addition: Warning message: In file(file, "rt") : cannot open file 'dir/bs_dev_segment_file.csv': No such file or directory Regards, -Ajit
[R] Help
Hi all, I have the mean vector mu <- c(0,0) and variance sigma <- c(10,10); now how do I sample from the bivariate normal density in R? Can someone suggest something? I did not find the function mvdnorm in R. Best Gyan
Re: [R] correlations between columns for each row
Just as an update on this problem: I have managed to get the variance for the selected columns. Now all I need is the covariance between these 2 selections. The two target column blocks are maindata[,c(174:213)] and maindata[,c(214:253)], and the aim is that a new column contains a covariance value between these on each row. I've played around with all sorts of apply (and derivatives of apply) in various different setups, so I think I'm close, but I feel like I'm chasing my tail here!
Re: [R] Putting directory path as a parameter
Hi Aajit, try using the ?paste function to combine the directory variable and the filename into one string, and then pass that to read.csv() or whatever: paste(dir, "/bs_dev_segment_file.csv", sep = "") HTH, Josh On Tue, Nov 15, 2011 at 6:12 AM, aajit75 aaji...@yahoo.co.in wrote: Hi List, I am new to R, this may be simple. I want to store a directory path as a parameter which in turn is to be used while reading and writing data from csv files. -- Joshua Wiley Ph.D. Student, Health Psychology Programmer Analyst II, ATS Statistical Consulting Group University of California, Los Angeles https://joshuawiley.com/
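A minimal sketch of the paste() approach (the path is the one from the question; paste0() is base R shorthand for paste(..., sep = "")):

```r
dir <- "C:/Users/Desktop"                               # example path from the question
p1  <- paste(dir, "/bs_dev_segment_file.csv", sep = "")
p2  <- paste0(dir, "/bs_dev_segment_file.csv")          # same result, shorter to type
p1  # "C:/Users/Desktop/bs_dev_segment_file.csv"
```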
Re: [R] Putting directory path as a parameter
Try pasting them together like paste(dir, ...). You may need to use the sep argument. Alternatively, change your working directory to dir with setwd(). M On Nov 15, 2011, at 9:12 AM, aajit75 aaji...@yahoo.co.in wrote: Hi List, I am new to R, this may be simple. I want to store a directory path as a parameter which in turn is to be used while reading and writing data from csv files.
Re: [R] correlations between columns for each row
Hi Rob, Here is one approach: ## define a function that does the calculations ## (the covariance of two vectors divided by the square root of ## the product of their variances is just a correlation) rF <- function(x, a, b) cor(x[a], x[b], use = "complete.obs") set.seed(1) bigdata <- matrix(rnorm(271 * 13890), ncol = 271) results <- apply(bigdata, 1, FUN = rF, a = 174:213, b = 214:253) ## combine bigdata <- cbind(bigdata, iecorr = results) Hope this helps, Josh On Tue, Nov 15, 2011 at 8:42 AM, robgriffin247 robgriffin...@hotmail.com wrote: Just as an update on this problem: I have managed to get the variance for the selected columns. Now all I need is the covariance between these 2 selections. -- Joshua Wiley Ph.D. Student, Health Psychology Programmer Analyst II, ATS Statistical Consulting Group University of California, Los Angeles https://joshuawiley.com/
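The identity in Josh's comment can be checked directly; a short sketch with made-up vectors:

```r
# The covariance scaled by the product of the standard deviations is the correlation
set.seed(1)
a <- rnorm(40)
b <- rnorm(40)
r1 <- cov(a, b) / sqrt(var(a) * var(b))
r2 <- cor(a, b)
all.equal(r1, r2)  # TRUE
```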
Re: [R] Help
On Nov 15, 2011, at 11:21 AM, Gyanendra Pokharel wrote: Hi all, I have the mean vector mu <- c(0,0) and variance sigma <- c(10,10); now how do I sample from the bivariate normal density in R? I did not find the function mvdnorm in R. But when you typed ?mvdnorm R should have returned a message to type ??mvdnorm. I get a ton of on-target hits when I do that. (... and I think you may not get as many hits because I have a bunch of packages installed, but there was a hit from MASS, which I think is installed by default.) -- David Winsemius, MD West Hartford, CT
Re: [R] Putting directory path as a parameter
file.path() is much better for this than paste(), e.g. dir <- "C:/Users/Desktop" pathname <- file.path(dir, "bs_dev_segment_file.csv") temp_data <- read.csv(pathname) /Henrik On Tue, Nov 15, 2011 at 10:08 AM, R. Michael Weylandt michael.weyla...@gmail.com wrote: Try pasting them together like paste(dir, ...). Alternatively, change your working directory to dir with setwd(). M
[R] points() colored by value
Hi R users, I want to color points by their value, for example: x <- c(1,2,3,4) y <- c(1,2,3,4) z <- c(2,3,4,9) x and y are coordinates; z is the value at each coordinate. points(x, y, col = rainbow(z)), or something like that, but I haven't found any solution so far. Thanks. Chris
Re: [R] Lower bounds on selfStart function not working
I was able to solve this problem by going back to nls and obtaining the initial parameter estimates through optim. When I used nlsList with my dataset, it took 2 minutes to solve and was not limited by the bounds. Now I have the bounds working and it takes 45 seconds to solve. Here is the new code:

library(nlme)  # for groupedData()
A <- 1.75
mu <- .2
l <- 2
b <- 0
x <- seq(0, 18, .25)
create.y <- function(x) {
  y <- b + A / (1 + exp(4 * mu / A * (l - x) + 2))
  return(y)
}
ys <- create.y(x)
yvec <- (rep(ys, 5)) * (.9 + runif(length(x) * 5) / 5)
Trt <- factor(c(rep("A1", length(x)), rep("A2", length(x)), rep("A3", length(x)),
                rep("A4", length(x)), rep("A5", length(x))))
Data <- data.frame(Trt, rep(x, 5), yvec)
names(Data) <- c("Trt", "x", "y")
NewData <- groupedData(y ~ x | Trt, data = Data)
ids <- levels(factor(NewData$Trt))
output <- matrix(0, length(ids), 4)
modeltest <- function(A, mu, l, b, x) {
  out <- vector(length = length(x))
  for (i in 1:length(x)) {
    out[i] <- b + A / (1 + exp(4 * mu / A * (l - x[i]) + 2))
  }
  return(out)
}
lower.bound <- list(A = .01, mu = 0, l = 0, b = 0)
for (i in 1:length(ids)) {
  xy <- subset(NewData, Trt == ids[i])
  y <- xy$y
  x <- xy$x
  A.s <- max(xy$y) - min(xy$y)
  mu.s <- A.s / 7.5
  l.s <- 0
  b.s <- max(min(xy$y), 0.1)
  value <- c(A.s, l.s, mu.s, b.s)
  # function to optimize
  func1 <- function(value) {
    A.s <- value[1]
    mu.s <- value[2]
    l.s <- value[3]
    b.s <- value[4]
    # generate vector for predicted y (y1) to evaluate against observed y
    y1 <- rep(0, length(xy$x))
    for (cnt in 1:length(xy$x)) {
      y1[cnt] <- b.s + A.s / (1 + exp(4 * mu.s / A.s * (l.s - x[cnt]) + 2))  # predicting y1
    }
    evl <- sum((xy$y - y1)^2)  # sum of squares is the function to minimize
    return(evl)
  }
  # optimizing
  oppar <- optim(c(A.s, mu.s, l.s, b.s), func1, method = "L-BFGS-B",
                 lower = c(0.0001, 0.0, 0.0, 0.0), control = list(maxit = 2000))
  # saving optimized parameters as starting values for nls
  value <- c(oppar$par[1L], oppar$par[2L], oppar$par[3L], oppar$par[4L])
  names(value) <- c("A", "mu", "l", "b")
  try(nmodel <- nls(y ~ modeltest(A, mu, l, b, x), data = xy, start = value,
                    lower = lower.bound, algorithm = "port"))
  coefv <- coef(nmodel)
  output[i, ] <- coefv
}

- In theory, practice and theory are the same. In practice, they are not. - Albert Einstein -- View this message in context: http://r.789695.n4.nabble.com/Lower-bounds-on-selfStart-function-not-working-tp3999231p4073639.html
[R] Estimating model parameters for system of equations
Hi all, I'm trying to estimate model parameters in R for a pretty simple system of equations, but I'm having trouble. Here is the system of equations (all derivatives): eqAlgae <- (u_Amax * C_A) * (1 - (Q_Amin / Q_A)) eqQuota <- (p_max * R_V) / (K_p + R_V) - ((Q_A - Q_Amin) * u_Amax) eqResource <- -C_A * (p_max * R_V) / (K_p + R_V) eqSystem <- list(C_A = eqAlgae, Q_A = eqQuota, R_V = eqResource) I want to estimate u_Amax, Q_Amin, p_max and K_p with the data I've collected using least squares. I've tried using systemfit but I'm not sure how to write out the equations (my attempt is above, but that doesn't work since I haven't given values to the parameters I'm trying to estimate - should I give those parameters initial values?). I've looked into the other functions to get least squares estimates (e.g. lm()) but I'm not sure how to use that for a system of equations. I have some experience with R but I'm a novice when it comes to parameter estimation, so any help would be much appreciated! Thank you!
Re: [R] Estimating model parameters for system of equations
Dear Louise On 15 November 2011 19:03, lstevenson louise.steven...@lifesci.ucsb.edu wrote: Hi all, I'm trying to estimate model parameters in R for a pretty simple system of equations, but I'm having trouble. [...] Your system of equations is non-linear in parameters. As lm() and systemfit() can only estimate models that are linear in parameters, you cannot use these commands to estimate your model. The systemfit package includes the function nlsystemfit() that is intended to estimate systems of non-linear equations. However, nlsystemfit() is still under development and often has convergence problems. Therefore, I wouldn't use it for serious applications. You can estimate your non-linear equations separately with nls(). If you want to estimate your equations jointly, I am afraid that you either have to switch to other software or have to implement the estimation yourself. You could, e.g., minimize the determinant of the residual covariance matrix with optim(), nlm(), nlminb(), or another optimizer, or you could maximize the likelihood function of the FIML model using maxLik(). Sorry that I (and R) cannot present you with a simple solution! Best wishes from Copenhagen, Arne -- Arne Henningsen http://www.arne-henningsen.name
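Along the lines of Arne's suggestion to fit each equation separately with nls(), here is a minimal single-equation sketch. The data, noise level, and "true" parameter values below are invented purely for illustration; only the Michaelis-Menten-style uptake form mirrors the question:

```r
# Hypothetical data for one equation of the form p_max * R_V / (K_p + R_V)
set.seed(42)
R_V    <- seq(0.1, 10, length.out = 60)
uptake <- 2.5 * R_V / (1.2 + R_V) + rnorm(60, sd = 0.02)  # truth: p_max = 2.5, K_p = 1.2

# nls() needs starting values for the nonlinear parameters
fit <- nls(uptake ~ p_max * R_V / (K_p + R_V), start = list(p_max = 1, K_p = 1))
coef(fit)  # estimates close to the generating values
```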
Re: [R] Help
I'm not sure your request completely makes sense: marginal means and variances are not sufficient to give the joint distribution; even if you can be assured it is bivariate normal, you still need a correlation. Just a heads up. Michael PS - next time would you please use a slightly more nuanced subject line for the archive? Thanks. On Nov 15, 2011, at 1:31 PM, David Winsemius dwinsem...@comcast.net wrote: But when you typed ?mvdnorm R should have returned a message to type ??mvdnorm. I get a ton of on-target hits when I do that. -- David Winsemius, MD West Hartford, CT
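Once a full covariance matrix has been chosen, sampling is one call to MASS::mvrnorm() (MASS ships with R, which is presumably the hit David's ?? search found). The zero correlation below is an assumed value for illustration, exactly the piece Michael points out is missing:

```r
library(MASS)                      # mvrnorm() lives in the recommended MASS package
set.seed(1)
mu    <- c(0, 0)
Sigma <- diag(c(10, 10))           # variances 10 and 10, correlation 0 (assumed)
samp  <- mvrnorm(n = 1000, mu = mu, Sigma = Sigma)
dim(samp)                          # 1000 rows, 2 columns
```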
Re: [R] points() colored by value
Try either col=z or col=rainbow(max(z))[z] depending on what color scheme you want. Michael On Nov 15, 2011, at 1:47 PM, Chris82 rubenba...@gmx.de wrote: Hi R users, I want to color points by their value, for example: x <- c(1,2,3,4) y <- c(1,2,3,4) z <- c(2,3,4,9) points(x, y, col = rainbow(z)), or something like that, but I haven't found any solution so far. Thanks. Chris
Re: [R] points() colored by value
Hi Chris, On Tue, Nov 15, 2011 at 1:47 PM, Chris82 rubenba...@gmx.de wrote: I want to color points by their value, for example: x <- c(1,2,3,4) y <- c(1,2,3,4) z <- c(2,3,4,9) points(x, y, col = rainbow(z)) In the general sense: plot(x, y, col = rainbow(length(unique(z)))[as.factor(z)]) Converting z to a factor to use as an index is just a quick way to convert z to sequential values 1,2,3,4 rather than 2,3,4,9 and to ensure that multiple and unsorted values use the correct color. If z contains only sequential values, that bit is unnecessary. I like RColorBrewer for things like this, rather than rainbow, but it depends on what you're trying to do. Sarah -- Sarah Goslee http://www.functionaldiversity.org
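A tiny sketch of the indexing trick Sarah describes, using the z from the question; the factor's integer codes 1..4 pick one color per distinct value:

```r
z    <- c(2, 3, 4, 9)
cols <- rainbow(length(unique(z)))[as.factor(z)]  # one color per distinct z value
length(cols)  # 4 - one entry per point, all distinct here
```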
[R] Help with error: no acceptable C compiler found in $PATH
Dear all, I am trying to install a package from bioconductor (biomaRt) for which I need the RCurl package. I get the following main error message when I try to install RCurl (and its dependencies). configure: error: no acceptable C compiler found in $PATH See `config.log' for more details. ERROR: configuration failed for package RCurl I searched for possible solutions and read in some online mailing list that I might have to install Xcode to install the gcc compiler. I am not sure if I should do this because I have installed RCurl in previous versions of R without any problems (on this same computer). I upgraded to the latest R (R version 2.14.0) and faced this problem. So I downgraded to R version 2.13.2 and still cannot install RCurl. I think my last successful installation of RCurl was with R version 2.11. Following is the complete error message and my R version details. I really appreciate any help or suggestions. Sincerely, Hari trying URL ' http://watson.nci.nih.gov/cran_mirror/src/contrib/XML_3.4-3.tar.gz' Content type 'application/octet-stream' length 906364 bytes (885 Kb) opened URL == downloaded 885 Kb trying URL ' http://watson.nci.nih.gov/cran_mirror/src/contrib/RCurl_1.7-0.tar.gz' Content type 'application/octet-stream' length 813252 bytes (794 Kb) opened URL == downloaded 794 Kb * installing *source* package XML ... checking for gcc... no checking for cc... no checking for cl.exe... no configure: error: no acceptable C compiler found in $PATH See `config.log' for more details. ERROR: configuration failed for package XML * removing /Library/Frameworks/R.framework/Versions/2.13/Resources/library/XML * installing *source* package RCurl ... checking for curl-config... /usr/bin/curl-config checking for gcc... no checking for cc... no checking for cc... no checking for cl... no configure: error: no acceptable C compiler found in $PATH See `config.log' for more details. 
ERROR: configuration failed for package RCurl * removing /Library/Frameworks/R.framework/Versions/2.13/Resources/library/RCurl * restoring previous /Library/Frameworks/R.framework/Versions/2.13/Resources/library/RCurl The downloaded packages are in /private/var/folders/a6/a60JdPfrHC0ZAizZWyNM-E+++TI/-Tmp-/RtmpVjBcvX/downloaded_packages
[R] ANNOUNCEMENT: 20% discount on the most recent R books from Chapman Hall/CRC!
Take advantage of a 20% discount on the most recent R books from Chapman Hall/CRC! We are pleased to offer our latest R books at a 20% discount through our website. To take advantage of this offer, simply visit www.crcpress.com, choose your titles and insert code AZL02 in the 'Promotion Code' field at checkout. Standard Shipping is always FREE on all orders from CRCPress.com! R Graphics, Second Edition Paul Murrell, The University of Auckland, New Zealand ISBN: 9781439831762 Publication Date: June 2011 Number of Pages: 546 Extensively updated to reflect the evolution of statistics and computing, the second edition of the bestselling R Graphics comes complete with new packages and new examples. Paul Murrell, widely known as the leading expert on R graphics, has developed an in-depth resource that helps both neophyte and seasoned users master the intricacies of R graphics. Discounted Price: $63.96 / £39.96 For more details and to order: http://www.crcpress.com/product/isbn/9781439831762 * Statistical Computing in C++ and R Randall L. Eubank, Arizona State University, Tempe, USA; Ana Kupresanin, Lawrence Livermore National Laboratory, California, USA ISBN: 9781420066500 Publication Date: December 2011 Number of Pages: 556 Parallel processing can be ideally suited for the solving of more complex problems in statistical computing. This book discusses code development in C++ and R, before going beyond to look at the valuable use of these two languages in unison. It covers linear equation solution with regression and linear models motivation, optimization with maximum likelihood and nonlinear least squares motivation, and random number generation. 
Discounted Price: $71.96 / £46.39 For more details and to order: http://www.crcpress.com/product/isbn/9781420066500 The R Primer Claus Thorn Ekstrom, University of Copenhagen, Frederiksberg, Denmark ISBN: 9781439862063 Publication Date: August 2011 Number of Pages: 299 Newcomers to R are often intimidated by the command-line interface, the vast number of functions and packages, or the processes of importing data and performing a simple statistical analysis. The R Primer provides a collection of concise examples and solutions to R problems frequently encountered by new users of this statistical software. Discounted Price: $31.96 / £20.79 For more details and to order: http://www.crcpress.com/product/isbn/9781439862063 Statistics and Data Analysis for Microarrays Using R and Bioconductor, Second Edition Sorin Draghici, Wayne State University, Detroit, Michigan, USA ISBN: 9781439809754 Publication Date: November 2011 Number of Pages: 1,036 Richly illustrated in color, Statistics and Data Analysis for Microarrays Using R and Bioconductor, Second Edition provides a clear and rigorous description of powerful analysis techniques and algorithms for mining and interpreting biological information. Discounted Price: $71.96 / £46.39 For more details and to order: http://www.crcpress.com/product/isbn/9781439809754 An R Companion to Linear Statistical Models Christopher Hay-Jahans ISBN: 9781439873656 Publication Date: October 2011 Number of Pages: 372 Focusing on user-developed programming, An R Companion to Linear Statistical Models serves two audiences: Those who are familiar with the theory and applications of linear statistical models and wish to learn or enhance their skills in R; and those who are enrolled in an R-based course on regression and analysis of variance. For those who have never used R, the book begins with a self-contained introduction to R that lays the foundation for later chapters. 
Discounted Price: $63.96 / £39.99 For more details and to order: http://www.crcpress.com/product/isbn/9781439873656 Analysis of Questionnaire Data with R Bruno Falissard, INSERM U669, Paris, France ISBN: 9781439817667 Publication Date: September 2011 Number of Pages: 280 Analysis of Questionnaire Data with R translates certain classic research questions into statistical formulations. As indicated in the title, the syntax of these statistical formulations is based on the well-known R language, chosen for its popularity, simplicity, and power of its structure. Discounted Price: $71.96 / £46.39 For more details and to order: http://www.crcpress.com/product/isbn/9781439817667 Click here to view our latest Statistics catalog! http://issuu.com/crcpress/docs/probability_statistics_mbcsig1_ms
Re: [R] Difference between two time series
Hello, Can you please help me with this? I am also stuck on the same problem. Sam
[R] Plot alignment with mtext
I would like the text plotted with 'mtext' to be aligned like it is when printed on the console. Here is what I have: print(emt) ME RMSE MAE MPE MAPE MASE original -1.034568e+07 1.097695e+08 2.433160e+07 -31.30554 37.47713 1.5100050 xreg 1.561235e+01 2.008599e+03 9.089473e+02 267.05490 280.66734 0.9893643 a <- capture.output(print(emt)) a [1] "ME RMSE MAE MPE MAPE MASE" [2] "original -1.034568e+07 1.097695e+08 2.433160e+07 -31.30554 37.47713 1.5100050" [3] "xreg 1.561235e+01 2.008599e+03 9.089473e+02 267.05490 280.66734 0.9893643" There are no tabs, but when adding to a plot with mtext like: op <- par(mfcol = c(2, 1), oma = c(0,0,4,0)) . . . . . a <- capture.output(print(emt)) mtext(a[1], line = 1, side = 3, outer = TRUE) mtext(a[2], line = 0, side = 3, outer = TRUE) mtext(a[3], line = -1, side = 3, outer = TRUE) the plotted text is not aligned like it is displayed on the console. I have looked at the strings and they all have the same length, so it seems that mtext is doing something with the spaces so that the output is not aligned. Any ideas on how I can get it aligned (by column)? Thank you. Kevin
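A hedged sketch of one common fix (not from a reply in this thread): mtext() draws with a proportional font by default, so runs of spaces do not line up the way they do on the console. Switching par(family) to a monospaced family before the mtext() calls usually restores the column alignment, assuming a "mono" font mapping is available on the device. The matrix below is a stand-in for emt:

```r
# Stand-in for the accuracy table; any printed matrix shows the same effect
m <- matrix(c(-1, 18.512), 1, dimnames = list("row", c("Df", "F")))
a <- capture.output(print(m))    # console-aligned lines, padded with spaces

pdf(NULL)                        # null device so the sketch runs anywhere
plot(1:10)
par(family = "mono")             # monospaced font keeps the space-padded columns aligned
for (i in seq_along(a)) mtext(a[i], side = 3, line = length(a) - i, adj = 0)
dev.off()
```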
Re: [R] Remove names and more from list with capture.output()
Thanks David - this is pretty close to what I am looking for. However, the output of vec[2] now includes the row number [1] and quotation marks at the endpoints of the row. Is there an easy way to exclude those? Thanks Sverre On Tue, Nov 15, 2011 at 8:11 AM, David Winsemius dwinsem...@comcast.net wrote: On Nov 14, 2011, at 11:49 PM, Sverre Stausland wrote: Hi R users, I end up with a list object after running an anova: lm(speed ~ 1 + dist + speed:dist, data = cars) -> Int lm(speed ~ 1 + dist, data = cars) -> NoInt anova(Int, NoInt) -> test test <- test[c("Df", "F", "Pr(>F)")][2,] is.list(test) [1] TRUE test Df F Pr(>F) 2 -1 18.512 8.481e-05 *** --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 I would like to use capture.output() when writing this information to a text file, but I would like to only print the row, not the names (Df F Pr(>F)), and not the significance codes. That is, I want the text printed to my text file to be: 2 -1 18.512 8.481e-05 *** The output of capture.output is just going to be a character vector and you should select the second element. vec <- capture.output(test) vec[2] [1] "2 -1 18.512 8.481e-05 ***" Is there a way to do this? Thanks Sverre -- David Winsemius, MD West Hartford, CT
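The "[1]" index and the quotation marks are not part of the string: they come from print()'s display of a character vector. Writing the element with writeLines() (or cat()) emits the bare text, which is what should go to the file; a minimal sketch:

```r
x <- "2 -1 18.512 8.481e-05 ***"   # stands in for vec[2]
print(x)       # displays as: [1] "2 -1 18.512 8.481e-05 ***"
writeLines(x)  # displays as: 2 -1 18.512 8.481e-05 ***
```

In the original setting, writeLines(vec[2], con = "somefile.txt") would therefore write exactly the row, without index or quotes.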
Re: [R] Estimating model parameters for system of equations
This problem is not for the faint of heart. Doug Bates, author of nls(...) has said that a general purpose implementaion of R code for multiresponse nonlinear regression is unlikely in the near future. You have a large set of issues to deal with here. First, you have a system of differential equations that are nonlinear so they need to be solved precisely. This is your starting point. Next you have three responses - Arne's comment about minimizing the determinant of the covariance model is a standard approach but not universal - it depends on the error structure of the data. Third, optimization routines (there are several in R) can be used but they are often persnickity (sp?) when used with numerical solutions to the differential equations. I've used nlm and optim both for multiresponse problems in R but with explicit solutions rather than numerical methods. Fourth, there may be issues with correlation among responses that can render the residual covariance matrix (near) singular and the determinant can vanish. To get started I'd 1. Ask myself if simultaneous estimation is what you really want to do or if you can get what you want from single response estimation using nls(...). If yes, then 2. Read the book by Bates and Watts (there are others, but this one is very concise and has examples). It's a John Wiley book called Nonlinear regression analysis and its applications, 1988. It's very likely in the UCSB library. 3. Start by estimating parameters for each response individually to see if the model can fit the data. If it can't, you need to reformulate your model and go back to 1. If so, then there's hope and proceed to trying to use two responses, then three. Whenever you have multiple responses you need to do the eigenvalue/eigenvector analysis described in that book and several previous papers by George Box and colleagues. 4. 
Finally, if all goes well, calculate the Bayesian 95% joint confidence regions for the parameter pairs to assess their uncertainty, and check the model residuals for compliance with normality, independence, and constant variance. 5. Collect Nobel prize! This is one approach I've used with success. There are others, as hinted at by Arne. Good luck - keep us posted on how it goes. Regards David Stevens On 11/15/2011 12:17 PM, Arne Henningsen wrote: Dear Louise On 15 November 2011 19:03, lstevenson louise.steven...@lifesci.ucsb.edu wrote: Hi all, I'm trying to estimate model parameters in R for a pretty simple system of equations, but I'm having trouble. Here is the system of equations (all derivatives): eqAlgae <- (u_Amax * C_A) * (1 - (Q_Amin / Q_A)) eqQuota <- (p_max * R_V) / (K_p + R_V) - ((Q_A - Q_Amin) * u_Amax) eqResource <- -C_A * (p_max * R_V) / (K_p + R_V) eqSystem <- list(C_A = eqAlgae, Q_A = eqQuota, R_V = eqResource) I want to estimate u_Amax, Q_Amin, p_max and K_p with the data I've collected using least squares. I've tried using systemfit, but I'm not sure how to write out the equations (my attempt is above, but that doesn't work since I haven't given values to the parameters I'm trying to estimate - should I give those parameters initial values?). I've looked into the other functions to get least-squares estimates (e.g. lm()) but I'm not sure how to use them for a system of equations. I have some experience with R, but I'm a novice when it comes to parameter estimation, so any help would be much appreciated! Thank you! Your system of equations is non-linear in parameters. As lm() and systemfit() can only estimate models that are linear in parameters, you cannot use these commands to estimate your model. The systemfit package includes the function nlsystemfit(), which is intended to estimate systems of non-linear equations. However, nlsystemfit() is still under development and often has convergence problems. Therefore, I wouldn't use it for serious applications. 
You can estimate your non-linear equations separately with nls(). If you want to estimate your equations jointly, I am afraid that you either have to switch to other software or have to implement the estimation yourself. You could, e.g., minimize the determinant of the residual covariance matrix with optim(), nlm(), nlminb(), or another optimizer, or you could maximize the likelihood function of the FIML model using maxLik(). Sorry that I (and R) cannot present you with a simple solution! Best wishes from Copenhagen, Arne __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
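As a rough illustration of the determinant criterion Arne and David describe, a joint fit could be set up along these lines. This is a sketch only: solve_system(), theta0, and obs are hypothetical placeholders for the user's model solver, starting values, and observed data, none of which appear in the thread.

```r
## Sketch: minimize the log-determinant of the residual cross-product
## matrix. solve_system(theta) is assumed to return an n x 3 matrix of
## model predictions (C_A, Q_A, R_V) for parameter vector theta.
obj <- function(theta, obs) {
  res <- as.matrix(obs) - solve_system(theta)        # n x 3 residuals
  determinant(crossprod(res), logarithm = TRUE)$modulus
}
fit <- optim(theta0, obj, obs = obs, method = "Nelder-Mead")
fit$par  # joint parameter estimates, if the optimizer converges
```

Whether this behaves well depends heavily on the starting values and on how precisely the differential equations are solved, as David notes above.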
[R] package installation
I'm getting the following error in a script: Error: could not find function "lmer". I'm wondering if my lme4 package is installed incorrectly. Can someone tell me the installation procedure? I looked at the support docs but couldn't translate that into anything that would work. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
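For completeness, the standard installation procedure (assuming internet access and a default CRAN mirror) is just:

```r
install.packages("lme4")  # download and install from CRAN
library(lme4)             # attach the package in each new session
exists("lmer")            # TRUE if the function is now visible
```

A "could not find function" error usually means the package was installed but not attached with library() in the current session.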
Re: [R] Plot alignment with mtext
Hi Kevin, On Tue, Nov 15, 2011 at 2:36 PM, Kevin Burton rkevinbur...@charter.net wrote: I would like the text plotted with 'mtext' to be aligned like it is for printing on the console. Here is what I have: print(emt) ME RMSE MAE MPE MAPE MASE original -1.034568e+07 1.097695e+08 2.433160e+07 -31.30554 37.47713 1.5100050 xreg 1.561235e+01 2.008599e+03 9.089473e+02 267.05490 280.66734 0.9893643 You don't provide any of the info in the posting guide (OS may be important here), or a reproducible example, which would also be helpful. But see below anyway. a <- capture.output(print(emt)) a [1] "ME RMSE MAE MPE MAPE MASE" [2] "original -1.034568e+07 1.097695e+08 2.433160e+07 -31.30554 37.47713 1.5100050" [3] "xreg 1.561235e+01 2.008599e+03 9.089473e+02 267.05490 280.66734 0.9893643" There are no tabs, but when adding to a plot with mtext like: op <- par(mfcol = c(2, 1), oma = c(0, 0, 4, 0)) . . . . . a <- capture.output(print(emt)) mtext(a[1], line = 1, side = 3, outer = TRUE) mtext(a[2], line = 0, side = 3, outer = TRUE) mtext(a[3], line = -1, side = 3, outer = TRUE) the plotted text is not aligned like when it is displayed on the console. I have looked at the strings and they all have the same length, so it seems that mtext is doing something with the spaces so that the output is not aligned. Any ideas on how I can get it aligned (by column)? The default font used for titles is proportionally spaced, at least on my linux system, so of course the columns won't line up. Try: a <- c("ME RMSE MAE MPE MAPE MASE", "original -1.034568e+07 1.097695e+08 2.433160e+07 -31.30554 37.47713 1.5100050", "xreg 1.561235e+01 2.008599e+03 9.089473e+02 267.05490 280.66734 0.9893643") par(mfcol = c(2, 1), oma = c(0, 0, 4, 0)) plot(1:10, 1:10) plot(1:10, 1:10) par(family = "mono") mtext(a[1], line = 1, side = 3, outer = TRUE) mtext(a[2], line = 0, side = 3, outer = TRUE) mtext(a[3], line = -1, side = 3, outer = TRUE) Or whatever the appropriate font family specification for your OS is. 
Sarah -- Sarah Goslee http://www.functionaldiversity.org __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Remove names and more from list with capture.output()
On Nov 15, 2011, at 2:43 PM, Sverre Stausland wrote: Thanks David - this is pretty close to what I am looking for. However, the output of vec[2] now includes the row number [1] and quotation marks at the endpoints of the row. Is there an easy way to exclude those? The usual method is to use cat() rather than print(). Those items are not part of the vector. -- David. Thanks Sverre On Tue, Nov 15, 2011 at 8:11 AM, David Winsemius dwinsem...@comcast.net wrote: On Nov 14, 2011, at 11:49 PM, Sverre Stausland wrote: Hi R users, I end up with a list object after running an anova: lm(speed ~ 1 + dist + speed:dist, data = cars) -> Int lm(speed ~ 1 + dist, data = cars) -> NoInt anova(Int, NoInt) -> test test <- test[c("Df", "F", "Pr(>F)")][2,] is.list(test) [1] TRUE test Df F Pr(>F) 2 -1 18.512 8.481e-05 *** --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 I would like to use capture.output() when writing this information to a text file, but I would like to only print the row, not the names (Df F Pr(>F)), and not the significance codes. That is, I want the text printed to my text file to be: 2 -1 18.512 8.481e-05 *** The output of capture.output() is just going to be a character vector, and you should select the second element. vec <- capture.output(test) vec[2] [1] "2 -1 18.512 8.481e-05 ***" Is there a way to do this? Thanks Sverre David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
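David's cat() suggestion can be sketched as follows (the output file name is my own choice, not from the thread):

```r
## print() adds the [1] index and surrounding quotes; cat() writes the
## bare text, which is what is wanted in the output file.
vec <- capture.output(test)            # 'test' as constructed above
cat(vec[2], file = "anova_row.txt", sep = "\n")
```

The file then contains exactly the row text, with no element index or quotation marks.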
Re: [R] Reading a specific column of a csv file in a loop
Yet another solution, this time using the LaF package: library(LaF) d <- c(1, 4, 7, 8) P1 <- laf_open_csv("M1.csv", column_types = rep("double", 10), skip = 1) P2 <- laf_open_csv("M2.csv", column_types = rep("double", 10), skip = 1) for (i in d) { M <- data.frame(P1[, i], P2[, i]) } (The skip = 1 is needed as laf_open_csv() doesn't read headers.) Jan On 11/08/2011 11:04 AM, Sergio René Araujo Enciso wrote: Dear all: I have two large files with 2000 columns. For each file I am performing a loop to extract the ith element of each file and create a data frame with both ith elements in order to perform further analysis. I am not extracting all the ith elements but only certain ones, which I indicate in a vector called d. See an example of my code below ### generate an example for the CSV files; the original files contain more than 2000 columns, here for the sake of simplicity they have only 10 columns M1 <- matrix(rnorm(1000), nrow=100, ncol=10, dimnames=list(seq(1:100), letters[1:10])) M2 <- matrix(rnorm(1000), nrow=100, ncol=10, dimnames=list(seq(1:100), letters[1:10])) write.table(M1, file="M1.csv", sep=",") write.table(M2, file="M2.csv", sep=",") ### the vector containing the i elements to be read d <- c(1,4,7,8) P1 <- read.table("M1.csv", header=TRUE) P2 <- read.table("M2.csv", header=TRUE) for (i in d) { M <- data.frame(P1[i], P2[i]) rm(list=setdiff(ls(),d)) } As the files are quite large, I want to include read.table() within the loop so that it only reads the ith element. I know that there is the option colClasses, for which I have to create a vector with zeros for all the columns I do not want to load. Nonetheless I have no idea how to make this vector change in the loop, so that the only element with no zeros is the ith element following the vector d. Any ideas how to do this? Or is there any other approach to load only a specific element? 
best regards, Sergio René __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
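If you'd rather stay with base R, the colClasses idea from the original post can be made concrete roughly like this. A sketch, with one caveat: because write.table() stored row names, each data line has one extra leading field, and colClasses covers the row-name column too, so it needs ncols + 1 entries here.

```r
## Sketch: read only column i of one of the csv files written above.
## "NULL" in colClasses tells read.table() to skip a column entirely.
read_col <- function(file, i, ncols = 10) {
  cc <- rep("NULL", ncols + 1)
  cc[1] <- "character"   # the row-name column
  cc[i + 1] <- "numeric" # the single data column we want
  read.table(file, header = TRUE, sep = ",", colClasses = cc)
}
for (i in d) {
  M <- data.frame(read_col("M1.csv", i), read_col("M2.csv", i))
  ## ... further analysis on M ...
}
```

This avoids holding all 2000 columns in memory at once, at the cost of re-reading the file for each i.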
Re: [R] package installation
Never mind - I fixed it. My script is throwing the following error: Error in glmer(formula = modelformula, data = data, family = binomial(link = "logit"), : Argument ‘method’ is deprecated. Use ‘nAGQ’ to choose AGQ. PQL is not available. I remember hearing somewhere that PQL is no longer available in lme4, but I have AGQ specified. Here's the line that fits my model: (fitmodel <- lmer(modelformula, data, family = binomial(link = "logit"), method = "AGQ")) If I change it to nAGQ I still get an error. Any ideas as to what's going on? - Original Message - From: Scott Raynaud scott.rayn...@yahoo.com To: r-help@r-project.org r-help@r-project.org Cc: Sent: Tuesday, November 15, 2011 1:50 PM Subject: package installation I'm getting the following error in a script: Error: could not find function "lmer". I'm wondering if my lme4 package is installed incorrectly. Can someone tell me the installation procedure? I looked at the support docs but couldn't translate that into anything that would work. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] correlations between columns for each row
Error in cor(x[a], x[b], use = "complete.obs") : 'x' must be numeric This is strange: it works on your example (and you've understood what I'm trying to do perfectly), but when I use it on the original data it comes up with the error above. I've checked str() and the columns are all numeric??? -Original Message- From: Joshua Wiley Sent: Tuesday, November 15, 2011 7:14 PM To: robgriffin247 Cc: r-help@r-project.org Subject: Re: [R] correlations between columns for each row Hi Rob, Here is one approach: ## define a function that does the calculations ## (the covariance of two vectors divided by the square root of ## the product of their variances is just a correlation) rF <- function(x, a, b) cor(x[a], x[b], use = "complete.obs") set.seed(1) bigdata <- matrix(rnorm(271 * 13890), ncol = 271) results <- apply(bigdata, 1, FUN = rF, a = 174:213, b = 214:253) ## combine bigdata <- cbind(bigdata, iecorr = results) Hope this helps, Josh On Tue, Nov 15, 2011 at 8:42 AM, robgriffin247 robgriffin...@hotmail.com wrote: Just as an update on this problem: I have managed to get the variance for the selected columns. Now all I need is the covariance between these two selections - the two target column ranges are maindata[, c(174:213)] and maindata[, c(214:253)], and the aim is that a new column contain a covariance value between these on each row. I've played around with all sorts of apply (and derivatives of apply) in various different setups, so I think I'm close, but I feel like I'm chasing my tail here! -- View this message in context: http://r.789695.n4.nabble.com/correlations-between-columns-for-each-row-tp4039193p4073208.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joshua Wiley Ph.D. 
Student, Health Psychology Programmer Analyst II, ATS Statistical Consulting Group University of California, Los Angeles https://joshuawiley.com/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Extract pattern from string
Wow, that is a very clever way to do it. Thank you very much! Cheers, Syrvn -- View this message in context: http://r.789695.n4.nabble.com/Extract-pattern-from-string-tp4073432p4074023.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Plot alignment with mtext
I hadn't considered altering the font. Thank you, I will try that. -Original Message- From: Sarah Goslee [mailto:sarah.gos...@gmail.com] Sent: Tuesday, November 15, 2011 1:53 PM To: Kevin Burton Cc: r-help@r-project.org Subject: Re: [R] Plot alignment with mtext Hi Kevin, On Tue, Nov 15, 2011 at 2:36 PM, Kevin Burton rkevinbur...@charter.net wrote: I would like the text plotted with 'mtext' to be aligned like it is for printing on the console. Here is what I have: You don't provide any of the info in the posting guide (OS may be important here), or a reproducible example, which would also be helpful. But see below anyway. print(emt) ME RMSE MAE MPE MAPE MASE original -1.034568e+07 1.097695e+08 2.433160e+07 -31.30554 37.47713 1.5100050 xreg 1.561235e+01 2.008599e+03 9.089473e+02 267.05490 280.66734 0.9893643 a <- capture.output(print(emt)) a [1] "ME RMSE MAE MPE MAPE MASE" [2] "original -1.034568e+07 1.097695e+08 2.433160e+07 -31.30554 37.47713 1.5100050" [3] "xreg 1.561235e+01 2.008599e+03 9.089473e+02 267.05490 280.66734 0.9893643" There are no tabs, but when adding to a plot with mtext like: op <- par(mfcol = c(2, 1), oma = c(0, 0, 4, 0)) . . . . . a <- capture.output(print(emt)) mtext(a[1], line = 1, side = 3, outer = TRUE) mtext(a[2], line = 0, side = 3, outer = TRUE) mtext(a[3], line = -1, side = 3, outer = TRUE) the plotted text is not aligned like when it is displayed on the console. I have looked at the strings and they all have the same length, so it seems that mtext is doing something with the spaces so that the output is not aligned. Any ideas on how I can get it aligned (by column)? The default font used for titles is proportionally spaced, at least on my linux system, so of course the columns won't line up. 
Try: a <- c("ME RMSE MAE MPE MAPE MASE", "original -1.034568e+07 1.097695e+08 2.433160e+07 -31.30554 37.47713 1.5100050", "xreg 1.561235e+01 2.008599e+03 9.089473e+02 267.05490 280.66734 0.9893643") par(mfcol = c(2, 1), oma = c(0, 0, 4, 0)) plot(1:10, 1:10) plot(1:10, 1:10) par(family = "mono") mtext(a[1], line = 1, side = 3, outer = TRUE) mtext(a[2], line = 0, side = 3, outer = TRUE) mtext(a[3], line = -1, side = 3, outer = TRUE) Or whatever the appropriate font family specification for your OS is. Sarah -- Sarah Goslee http://www.functionaldiversity.org __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] correlations between columns for each row
Is the whole thing a data frame? Then any multi-column subset is also a data frame. Try adding an as.matrix() wrapper in the definition of rF. Michael On Nov 15, 2011, at 3:14 PM, Rob Griffin robgriffin...@hotmail.com wrote: Error in cor(x[a], x[b], use = "complete.obs") : 'x' must be numeric This is strange: it works on your example (and you've understood what I'm trying to do perfectly), but when I use it on the original data it comes up with the error above. I've checked str() and the columns are all numeric??? -Original Message- From: Joshua Wiley Sent: Tuesday, November 15, 2011 7:14 PM To: robgriffin247 Cc: r-help@r-project.org Subject: Re: [R] correlations between columns for each row Hi Rob, Here is one approach: ## define a function that does the calculations ## (the covariance of two vectors divided by the square root of ## the product of their variances is just a correlation) rF <- function(x, a, b) cor(x[a], x[b], use = "complete.obs") set.seed(1) bigdata <- matrix(rnorm(271 * 13890), ncol = 271) results <- apply(bigdata, 1, FUN = rF, a = 174:213, b = 214:253) ## combine bigdata <- cbind(bigdata, iecorr = results) Hope this helps, Josh On Tue, Nov 15, 2011 at 8:42 AM, robgriffin247 robgriffin...@hotmail.com wrote: Just as an update on this problem: I have managed to get the variance for the selected columns. Now all I need is the covariance between these two selections - the two target column ranges are maindata[, c(174:213)] and maindata[, c(214:253)], and the aim is that a new column contain a covariance value between these on each row. I've played around with all sorts of apply (and derivatives of apply) in various different setups, so I think I'm close, but I feel like I'm chasing my tail here! -- View this message in context: http://r.789695.n4.nabble.com/correlations-between-columns-for-each-row-tp4039193p4073208.html Sent from the R help mailing list archive at Nabble.com. 
__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joshua Wiley Ph.D. Student, Health Psychology Programmer Analyst II, ATS Statistical Consulting Group University of California, Los Angeles https://joshuawiley.com/
Re: [R] package installation
OK, I think I see the problem. Rather than setting method="AGQ" I need nAGQ=1. Doing so throws the following error: Warning messages: 1: glm.fit: algorithm did not converge 2: In mer_finalize(ans) : gr cannot be computed at initial par (65) Error in diag(vcov(fitmodel)) : error in evaluating the argument 'x' in selecting a method for function 'diag': Error in asMethod(object) : matrix is not symmetric [1,2] I need some help interpreting and debugging this. One thing that I suspect is that there is a column of zeroes in the design matrix, but I'm not sure. Any other possibilities here, and how can I diagnose? - Original Message - From: Scott Raynaud scott.rayn...@yahoo.com To: r-help@r-project.org r-help@r-project.org Cc: Sent: Tuesday, November 15, 2011 2:11 PM Subject: Re: package installation Never mind - I fixed it. My script is throwing the following error: Error in glmer(formula = modelformula, data = data, family = binomial(link = "logit"), : Argument ‘method’ is deprecated. Use ‘nAGQ’ to choose AGQ. PQL is not available. I remember hearing somewhere that PQL is no longer available in lme4, but I have AGQ specified. Here's the line that fits my model: (fitmodel <- lmer(modelformula, data, family = binomial(link = "logit"), method = "AGQ")) If I change it to nAGQ I still get an error. Any ideas as to what's going on? - Original Message - From: Scott Raynaud scott.rayn...@yahoo.com To: r-help@r-project.org r-help@r-project.org Cc: Sent: Tuesday, November 15, 2011 1:50 PM Subject: package installation I'm getting the following error in a script: Error: could not find function "lmer". I'm wondering if my lme4 package is installed incorrectly. Can someone tell me the installation procedure? I looked at the support docs but couldn't translate that into anything that would work. 
__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
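One hedged way to test the all-zero-column suspicion raised above: inspect the fixed-effects design matrix directly. Here fixedformula is a hypothetical copy of modelformula with the random-effects terms removed, since model.matrix() does not understand (...|...) terms.

```r
## List any constant columns in the fixed-effects design matrix.
## The intercept column is always constant, so "(Intercept)" will be
## flagged; anything else listed is a suspect predictor.
X <- model.matrix(fixedformula, data)
colnames(X)[apply(X, 2, function(col) length(unique(col)) == 1L)]
```

A constant (e.g. all-zero) column makes the design rank-deficient, which is consistent with the vcov()/diag() failure reported above.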
Re: [R] correlations between columns for each row
Excellent! as.matrix() didn't work, but I switched it to as.numeric() around both variables in the function definition and it did work: rF <- function(x, a, b) cor(as.numeric(x[a]), as.numeric(x[b]), use = "complete.obs") maindata$rFcor <- apply(maindata, 1, FUN = rF, a = 174:213, b = 214:253) Thanks very much, both of you! Rob -Original Message- From: R. Michael Weylandt michael.weyla...@gmail.com Sent: Tuesday, November 15, 2011 9:28 PM To: Rob Griffin Cc: Joshua Wiley ; r-help@r-project.org Subject: Re: [R] correlations between columns for each row Is the whole thing a data frame? Then any multi-column subset is also a data frame. Try adding an as.matrix() wrapper in the definition of rF. Michael On Nov 15, 2011, at 3:14 PM, Rob Griffin robgriffin...@hotmail.com wrote: Error in cor(x[a], x[b], use = "complete.obs") : 'x' must be numeric This is strange: it works on your example (and you've understood what I'm trying to do perfectly), but when I use it on the original data it comes up with the error above. I've checked str() and the columns are all numeric??? 
-Original Message- From: Joshua Wiley Sent: Tuesday, November 15, 2011 7:14 PM To: robgriffin247 Cc: r-help@r-project.org Subject: Re: [R] correlations between columns for each row Hi Rob, Here is one approach: ## define a function that does the calculations ## (the covariance of two vectors divided by the square root of ## the product of their variances is just a correlation) rF <- function(x, a, b) cor(x[a], x[b], use = "complete.obs") set.seed(1) bigdata <- matrix(rnorm(271 * 13890), ncol = 271) results <- apply(bigdata, 1, FUN = rF, a = 174:213, b = 214:253) ## combine bigdata <- cbind(bigdata, iecorr = results) Hope this helps, Josh On Tue, Nov 15, 2011 at 8:42 AM, robgriffin247 robgriffin...@hotmail.com wrote: Just as an update on this problem: I have managed to get the variance for the selected columns. Now all I need is the covariance between these two selections - the two target column ranges are maindata[, c(174:213)] and maindata[, c(214:253)], and the aim is that a new column contain a covariance value between these on each row. I've played around with all sorts of apply (and derivatives of apply) in various different setups, so I think I'm close, but I feel like I'm chasing my tail here! -- View this message in context: http://r.789695.n4.nabble.com/correlations-between-columns-for-each-row-tp4039193p4073208.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Joshua Wiley Ph.D. 
Student, Health Psychology Programmer Analyst II, ATS Statistical Consulting Group University of California, Los Angeles https://joshuawiley.com/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Difference between two time series
It's not clear what it means for the differences to be of increasing order, but if you simply mean the differences are increasing, perhaps something like this will work: library(caTools) X = cumsum(2 * (runif(5e4) > 0.5) - 1) # create a random walk Y = runmean(X, 30, endrule = "mean", align = "right") D = X - Y # create the difference series # Now we need to find the ranges of increase: to do this, we can just lag D sign(D - c(0, D[1:(length(D)-1)])) If you want to find the length of each run, or to find runs of a certain length, try rle(). Michael On Tue, Nov 15, 2011 at 2:18 PM, Sarwarul Chy sarwar.sha...@gmail.com wrote: Hello, Can you please help me with this? I am also stuck on the same problem. Sam -- View this message in context: http://r.789695.n4.nabble.com/Difference-between-two-time-series-tp819843p4073800.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] using tapply() with the quantile function?
Hi: Summary: I am trying to determine the 90th percentile of ambulance response times for groups of data. Background: A fire chief would like to look at emergency response times at the 90th percentile for 1 kilometer grids in Cape Coral, Florida. I have mapped out ambulance response times on a GIS map. Then I superimpose a regularly-spaced grid over the response times and spatially join the ambulance responses with the grids. Therefore each emergency incident has a grid ID and a response time. This is exported out as a text file and read into R. Using R I issue the command tapply(Cape $ ResponseTime, Cape $ Grid_ID, mean) and this gives me the mean average of the response times for each 1 kilometer grid. This returns a result. It is not in the format I wanted but I can work on that as soon as I get the percentile function working. I am hoping to get a list which I can write to a text file so I can join the data back into my GIS based on the Grid ID. For example: Grid_ID, MeanAverageResponseTime 1848, 450 (or some number) 1849, 470 1850, 389 etc etc Problem: I am expecting that this command will give me the 90th percentile tapply(Cape, Cape $ Grid_ID, quantile(Cape $ ResponseTime, 0.9)). However the error message that is returned is: Error in match.fun(FUN) : 'quantile(Cape$Responsetime, 0.9)' is not a function, character or symbol. What I am hoping to get back is the following: Grid_ID, 90thPercentileResponseTime 1848, 430 (or some number) 1849, 441 1850, 360 etc etc This would then be joined in my GIS map by the Grid_ID and I could then make a map showing the variation of response times at the 90th percentile. I can't get past this error message. Question 1.) Why would tapply work for mean but not for quantile? Question 2.) What is the correct syntax? Question 3.) How do I get the results to look like a comma delimited list as shown above? 
Snapshot of data to play with: Grid_ID, ResponseTime 1848, 429 1848, 122 1848, 366 1848, 311 1848, 337 1848, 245 1848, 127 1848, 596 1848, 356 1848, 239 1848, 159 1848, 366 1848, 457 1848, 145 1848, 198 1848, 68 1848, 224 1848, 226 1849, 592 1849, 424 1849, -52 1849, 196 1849, 194 1850, 351 1854, 316 1855, 650 1858, 628 1858, 466 1861, 133 1861, 137 1871, 359 1872, 580 1872, 548 1874, 469 Feel free to copy this raw data into a notepad text file. Name it Cape.txt on your C: drive. Then in the R console I am using the following to read it in: Cape <- read.table("C:/Cape.txt", sep = ",", header = TRUE) thanks David Kulpanowski Database Analyst Lee County Public Safety PO Box 398 Fort Myers, FL 33902 (ph) 239-533-3962 dkulpanow...@leegov.com Latitude 26.528843 Longitude -81.861486 Please note: Florida has a very broad public records law. Most written communications to or from County Employees and officials regarding County business are public records available to the public and media upon request. Your email communication may be subject to public disclosure. Under Florida law, email addresses are public records. If you do not want your email address released in response to a public records request, do not send electronic mail to this entity. Instead, contact this office by phone or in writing. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] using tapply() with the quantile function?
1) tapply will work for quantile, but the syntax was a little off: try this tapply(Cape $ ResponseTime, Cape $ Grid_ID, quantile, c(0.05, 0.95)) The fourth argument is additional parameters passed to the function FUN which here is quantile. You could also do this tapply(Cape $ ResponseTime, Cape $ Grid_ID, function(x) quantile(x, c(0.05, 0.95))) 2) See 1 3) I can do it with the simplify2array() function but I would have expected the simplify = T argument to tapply() to get the job done. Let me look into that and get back to you -- I know for sapply() simplify = T is what calls simplify2array() so I'm pondering. Thanks for spending so much time on a well-crafted question, Michael PS -- An even easier way to send data via plain text is to use dput() which creates code that can be directly pasted into an R session to replicate your data. Super helpful for stuff like this On Tue, Nov 15, 2011 at 2:54 PM, Kulpanowski, David dkulpanow...@leegov.com wrote: Hi: Summary: I am trying to determine the 90th percentile of ambulance response times for groups of data. Background: A fire chief would like to look at emergency response times at the 90th percentile for 1 kilometer grids in Cape Coral, Florida. I have mapped out ambulance response times on a GIS map. Then I superimpose a regularly-spaced grid over the response times and spatially join the ambulance responses with the grids. Therefore each emergency incident has a grid ID and a response time. This is exported out as a text file and read into R. Using R I issue the command tapply(Cape $ ResponseTime, Cape $ Grid_ID, mean) and this gives me the mean average of the response times for each 1 kilometer grid. This returns a result. It is not in the format I wanted but I can work on that as soon as I get the percentile function working. I am hoping to get a list which I can write to a text file so I can join the data back into my GIS based on the Grid ID. 
For example: Grid_ID, MeanAverageResponseTime 1848, 450 (or some number) 1849, 470 1850, 389 etc etc Problem: I am expecting that this command will give me the 90th percentile tapply(Cape, Cape $ Grid_ID, quantile(Cape $ ResponseTime, 0.9)). However the error message that is returned is: Error in match.fun(FUN) : 'quantile(Cape$Responsetime, 0.9)' is not a function, character or symbol. What I am hoping to get back is the following: Grid_ID, 90thPercentileResponseTime 1848, 430 (or some number) 1849, 441 1850, 360 etc etc This would then be joined in my GIS map by the Grid_ID and I could then make a map showing the variation of response times at the 90th percentile. I can't get past this error message. Question 1.) Why would tapply work for mean but not for quantile? Question 2.) What is the correct syntax? Question 3.) How do I get the results to look like a comma delimited list as shown above? Snap shot of data to play with: Grid_ID, ResponseTime 1848, 429 1848, 122 1848, 366 1848, 311 1848, 337 1848, 245 1848, 127 1848, 596 1848, 356 1848, 239 1848, 159 1848, 366 1848, 457 1848, 145 1848, 198 1848, 68 1848, 224 1848, 226 1849, 592 1849, 424 1849, -52 1849, 196 1849, 194 1850, 351 1854, 316 1855, 650 1858, 628 1858, 466 1861, 133 1861, 137 1871, 359 1872, 580 1872, 548 1874, 469 feel free to copy this raw data into a notepad text file. Name it Cape.txt on your C: drive. Then in the R console I am using the following to read it in: Cape - read.table(C:/Cape.txt, sep=,, header=TRUE) thanks David Kulpanowski Database Analyst Lee County Public Safety PO Box 398 Fort Myers, FL 33902 (ph) 239-533-3962 dkulpanow...@leegov.com Latitude 26.528843 Longitude -81.861486 Please note: Florida has a very broad public records law. Most written communications to or from County Employees and officials regarding County business are public records available to the public and media upon request. Your email communication may be subject to public disclosure. 
Under Florida law, email addresses are public records. If you do not want your email address released in response to a public records request, do not send electronic mail to this entity. Instead, contact this office by phone or in writing.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
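Putting Michael's suggestion into a self-contained sketch (not from the thread itself): the data frame below rebuilds a few of the sample rows from the question, and the output file name is illustrative.

```r
# Rebuild a small subset of the data posted in the question
Cape <- data.frame(
  Grid_ID      = c(1848, 1848, 1848, 1849, 1849, 1850),
  ResponseTime = c(429, 122, 366, 592, 424, 351)
)

# 90th percentile of response time per grid cell:
# arguments after FUN (here probs = 0.9) are passed on to quantile()
p90 <- tapply(Cape$ResponseTime, Cape$Grid_ID, quantile, probs = 0.9)

# Turn the named vector into the two-column, comma-delimited layout
# asked for in question 3, ready to join back to the GIS by Grid_ID
out <- data.frame(Grid_ID = names(p90), P90ResponseTime = as.vector(p90))
write.csv(out, "Cape_p90.csv", row.names = FALSE, quote = FALSE)
```

Because tapply() returns a named vector here, converting it through data.frame() recovers the Grid_ID column from the names.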
Re: [R] using tapply() with the quantile function?
There's a slight variant that might be even more helpful if you need to line the data up with how you started: ave(). I'll let you work out the details, but the key difference is that it returns a vector that has the 90th percentile for each group, each time that group appears, instead of the summary table that you'd get from tapply().

Michael

On Tue, Nov 15, 2011 at 3:52 PM, R. Michael Weylandt michael.weyla...@gmail.com wrote: [quoted message snipped]
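A minimal sketch of the ave() variant Michael describes (toy data; the column names are taken from the question in this thread):

```r
Cape <- data.frame(
  Grid_ID      = c(1848, 1848, 1849, 1849, 1850),
  ResponseTime = c(429, 122, 592, 424, 351)
)

# ave() returns one value per row, repeating each group's 90th
# percentile wherever that group appears -- handy for attaching the
# summary back onto the original incident-level table
Cape$P90 <- ave(Cape$ResponseTime, Cape$Grid_ID,
                FUN = function(x) quantile(x, 0.9))
```

Every 1848 row then carries the same P90 value, every 1849 row its own, and so on, so no separate join is needed.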
Re: [R] using tapply() with the quantile function?
David:

You need to re-read ?tapply _carefully_. Note that:

FUN: the function to be applied, or NULL. In the case of functions like +, %*%, etc., the function name must be backquoted or quoted.

Now note that in tapply(whatever, byfactor, mean), mean _is_ a function. However, in tapply(whatever, byfactor, quantile(something, .9)), quantile(something, .9) is not a function -- it's a function _call_. So correct syntax would be:

tapply(whatever, byfactor, quantile, probs = .9)  ## probs is a ... argument

Alternatively, using anonymous functions, you could do:

tapply(whatever, byfactor, function(x) quantile(x, probs = .9))

where an unnamed (anonymous) function has been written in place for the FUN argument of tapply.

Cheers,
Bert

On Tue, Nov 15, 2011 at 11:54 AM, Kulpanowski, David dkulpanow...@leegov.com wrote: [quoted message snipped]

--
Bert Gunter
Genentech Nonclinical Biostatistics
Internal Contact Info:
Phone: 467-7374
Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
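Bert's function-versus-call distinction can be seen side by side in a toy example (illustrative data, not from the thread):

```r
x <- c(1, 5, 9, 2, 8)
f <- factor(c("a", "a", "b", "b", "b"))

# FUN must be a function object (or its name), extra args go via ...
tapply(x, f, quantile, probs = 0.9)

# ...or wrap the call in an anonymous function
tapply(x, f, function(v) quantile(v, probs = 0.9))

# whereas quantile(x, 0.9) is a *call*: it evaluates to a number
# before tapply ever sees it, hence the match.fun(FUN) error
# tapply(x, f, quantile(x, 0.9))  # Error in match.fun(FUN)
```

The commented-out line reproduces exactly the error message David reported.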
[R] equal spacing of the polygons in levelplot key (lattice)
Given the example:

R> (levs <- quantile(volcano, c(0, 0.1, 0.5, 0.9, 0.99, 1)))
  0%  10%  50%  90%  99% 100%
  94  100  124  170  189  195
R> levelplot(volcano, at = levs)

How can I make the key categorical, with the size of the divisions equally spaced in the key? E.g., five equal-size rectangles with labels at c(100, 124, 170, 189, 195)? Apologies if this is obvious.

-A

R.version:
platform       i386-pc-mingw32
arch           i386
os             mingw32
system         i386, mingw32
major          2
minor          14.0
year           2011
month          10
day            31
svn rev        57496
language       R
version.string R version 2.14.0 (2011-10-31)
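One way to sketch this (an assumption on my part, not an answer given in the thread): recode the matrix into quantile bins so the bins, not the raw values, are equally spaced, then label the key with the original breaks via levelplot's colorkey list interface.

```r
library(lattice)

levs <- quantile(volcano, c(0, 0.1, 0.5, 0.9, 0.99, 1))

# Recode each cell by its quantile class (1..5); cut() preserves the
# class boundaries defined by levs
bins <- matrix(as.integer(cut(volcano, levs, include.lowest = TRUE)),
               nrow = nrow(volcano))

# Five equal-width rectangles in the key, labelled with the original
# quantile breaks at the class boundaries
levelplot(bins, at = seq(0.5, 5.5, by = 1),
          colorkey = list(labels = list(at = seq(0.5, 5.5, by = 1),
                                        labels = round(levs))))
```

The trade-off is that the plotted values are now class indices, so the colour scale is categorical rather than continuous.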
Re: [R] Upgrade R?
Generally I have several versions of R installed on any PC running Windows 7 and have never had any problem choosing which version of R was to be the default one. In the past the updated version did not always register as the default, but there was and is a utility, RSetReg.exe, in the bin directory of each distribution which appeared to solve the problem. RStudio, Vincent Goulet's emacs package for Windows, and several other Windows GUIs have facilities for setting the default R to use.

When installing an upgrade I do the following:

1. Download and install the new version.
2. Rename the library directory in the new distribution to librarytemp.
3. Copy the library from the old version to the new distribution.
4. Copy/move the contents of librarytemp to this library directory (I want to keep the updated base packages).
5. When I am satisfied that the old version is no longer required, I uninstall it either from the Start menu or with the unins000.exe utility in the R distribution. When this is run, any remaining files in the R distribution may be deleted.

I have never had any R-related problems with this procedure. (I have had problems with an anti-virus program, but that is another matter.)

Best Regards
John

On 15 November 2011 08:58, Rainer M Krug r.m.k...@gmail.com wrote: [quoted message snipped]

--
John C Frain
Economics Department
Trinity College Dublin
Dublin 2
Ireland
www.tcd.ie/Economics/staff/frainj/home.html
mailto:fra...@tcd.ie
mailto:fra...@gmail.com
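As an alternative to copying library folders by hand, a common approach (standard base-R functions, not something suggested in this thread) is to record the package list under the old version and reinstall under the new one:

```r
# In the OLD R version: record the installed packages
pkgs <- rownames(installed.packages())
save(pkgs, file = "installed_pkgs.RData")

# In the NEW R version: install anything missing from CRAN, then
# rebuild packages compiled under the previous R release
load("installed_pkgs.RData")
install.packages(setdiff(pkgs, rownames(installed.packages())))
update.packages(checkBuilt = TRUE, ask = FALSE)
```

This sidesteps the base-package overwrite problem that steps 2-4 above work around, at the cost of re-downloading everything.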
Re: [R] how to include integrate in a function that can be solved with uniroot?
Thanks Michael,

On 11/14/2011 3:30 PM, R. Michael Weylandt wrote:
> You need to explicitly pass th to your function with the ... argument of integrate.

That was a point I was missing! Thanks again. This solved my problems for this time.

Gerrit.

> E <- function(th){
>   integrate(function(x, th) x*g(x, th), 0, Inf, th)$value
> }
>
> Also, it's value, not Value, which might be producing errors of another sort.
>
> Michael

On Mon, Nov 14, 2011 at 9:16 AM, Gerrit Draisma gdrai...@xs4all.nl wrote:

Thanks Michael, I see now how to include the integrate function in the EV function. And apologies: I now realize that my code was sloppy. I intended to write

E <- function(th) {
  integrate(f = function(x, th){ x*g(x, th) }, 0, Inf)$Value
}
E(1/10)

But that does not work either.

Gerrit.

On 11/14/2011 2:50 PM, R. Michael Weylandt wrote:
> Try this:
>
> EV <- function(lamb){
>   fnc <- function(x) x * dexp(x, lamb)
>   integrate(fnc, 0, Inf)$value
> }
>
> Your problem is that there's nothing to translate th to lambda in your code for E.
>
> Michael

On Mon, Nov 14, 2011 at 5:32 AM, Gerrit Draisma gdrai...@xs4all.nl wrote:

Hello, I am trying to define expectation as an integral and use uniroot to find the distribution parameter for a given expectation. However, I fail to understand how to define the functions involved properly and pass the parameters correctly. Can anyone help me out? Thanks, Gerrit Draisma.
This is what I tried:
===
# exponential density
g <- function(x, lambda){
  lambda * exp(-lambda*x)
}

# expectation with lambda = 1/10
> integrate(f = function(x, lambda=1/10) { x*g(x, lambda) }, 0, Inf)
10 with absolute error < 6.7e-05

# *how to write this as a function?*
> E <- function(lambda) {
+   integrate(f = function(x, th){ x*g(x, lambda) }, 0, Inf)$Value
+ }
> E(1/10)
NULL

# *how to include this function in uniroot to find lambda*
# *for a given expectation?*
> mu <- 10
> uniroot(f <- function(th){ E(th) - mu }, lower=1, upper=100)
Error in if (is.na(f.lower)) stop("f.lower = f(lower) is NA") : argument is of length zero
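Pulling the thread's corrections together into one runnable sketch. One adjustment is mine, not from the thread: since E(lambda) = 1/lambda for the exponential, the original bracket lower=1, upper=100 does not straddle the root 0.1, so the lower bound is moved down.

```r
# exponential density
g <- function(x, lambda) lambda * exp(-lambda * x)

# expectation as an integral; lambda is captured by the inner function,
# so nothing extra needs passing through integrate's ... argument
E <- function(lambda) {
  integrate(function(x) x * g(x, lambda), 0, Inf)$value  # $value, not $Value
}

E(1/10)  # approximately 10

# find lambda giving a target expectation: root of E(lambda) - mu.
# E(lambda) = 1/lambda, so for mu = 10 the root is 0.1; the bracketing
# interval must have opposite signs at its ends (lower = 1 would not:
# E(1) - 10 and E(100) - 10 are both negative)
mu <- 10
uniroot(function(lambda) E(lambda) - mu, lower = 0.01, upper = 100)$root
```

Capturing lambda in the closure is an alternative to Michael's fix of passing th through integrate's ... argument; both avoid the shadowed-parameter bug in the original E.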