Re: [R] Plotting in x and y Graph Using 1st Quadrant Only

2012-07-28 Thread R. Michael Weylandt michael.weyla...@gmail.com
plot(1:5) seems to suffice. 

More generally,

plot(x, y, xlim = c(0, max(x)), ylim = c(0, min(y))) 

if I understand your request. 

Michael

On Jul 27, 2012, at 5:58 PM, Matilda E. Gogos matildaelizabe...@gmail.com 
wrote:

 How can I graph in x and y using the 1st quadrant, using the cartesian
 plane, in r? I use a mac 10.6.8. Will I be able to print this out?
 
 -- 
 Matilda Gogos
 matildaelizabe...@gmail.com
 
[[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] limit of detection (LOD) by logistic regression

2012-07-28 Thread Luigi
 

Dear all,

I am trying to apply the logistic regression to determine the limit of
detection (LOD) of a molecular biology assay, the polymerase chain reaction
(PCR). The aim of the procedure is to identify the value (variable
dilution) that determine a 95% probability of success, that is
positive/total=0.95. The procedure I have implemented seemed to work
looking at the figure obtained from the sample set 1; however the figure
obtained from the sample set 2 shows that interpolation is not correct.
Admittedly, these are not the best sample sets to determine the LOD -there
are too many 100% successes - but the regression should still work. 

Does anybody have some tip to solve this incongruence?

Many thanks,

Luigi Marongiu, MSc

 




 

### SAMPLE SET No. 1

 

 

### define data

 labs-c(dilution, total, positive)

 p.1-c(10, 20,  19)

 p.2-c(100, 20, 20)

 p.3-c(1000, 20, 20)

 p.4-c(1, 20, 20)

 p.5-c(10, 20, 20)

 p.6-c(100, 20, 20)

 

### create data frame from matrix

 my.mat-matrix(c(p.1, p.2, p.3, p.4, p.5, p.6), nrow=6, byrow=TRUE)

 dimnames(my.mat)-list(c(1:6),labs)

 my.data-as.data.frame(my.mat)

 attach(my.data)

 my.data

 

### define frequency data

 Y-cbind(positive, total-positive)

 

 

### fit model LOGIT

 model-glm(Y ~ dilution, family=binomial(link=logit), data=my.data)

 

### plot data and regression line

 plot(dilution, positive/total, pch=16, ylim=c(0,1), log = x,
xaxt=n, xlim=c(1, 100),

main=Positive amplification by PCR, ylab=Fraction amplified,
xlab=expression(paste(Standard dilutions (c/,mu,l

 

### add x axis (William Dunlap, personal communication)

 xlim - par(usr)[1:2]

 lab.x - seq(ceiling(xlim[1]), floor(xlim[2]))

 axis(side=1, at=10^lab.x, lab=as.expression(lapply(lab.x,
function(x)bquote(10^.(x)

 

 lines(dilution, fitted(model), lty=1)

 

### set significance level and logit 

 # define significance

   level-0.95 

 # set logit value

   logit.LOD-log(level/(1-level))

 

### set function to determine LOD (Vito Ricci, personal communication)

 LOD-function(mod) as.vector((logit.LOD-coef(mod)[1])/coef(mod)[2])

 

### computation of the LOD value

 x.LOD-LOD(model)

 

### compute S.E.

 # set variables

   co-model$coef

   model.sum-summary(model)

   SE.co-model.sum$coef[,2]

   COV.co-model.sum$cov.scaled[1,2]

 # compute SE

   SE.LOD-abs(x.LOD) * sqrt( (SE.co[1]/co[1])^2 +

  (SE.co[2]/co[2])^2 -

  2*(COV.co/(co[1]*co[2])) )

   log.SE.LOD-log10(SE.LOD)

 

 # define boundaries

   bottom-x.LOD-(1.96*log.SE.LOD)

   ceiling-x.LOD+(1.96*log.SE.LOD)

 

 

### REPORT

 # mean value

   x.LOD

 # bottom

   bottom

 # ceiling

   ceiling

 

### plot LOD

 lines(c(0.1, x.LOD), c(0.95, 0.95), lty=2)

 lines(c(x.LOD, x.LOD), c(0.95, 0), lty=2)

 text(x.LOD, 0.02, labels = round(x.LOD, digits = 2), pos=4, offset =
0.3, cex = 0.7)

 text(1, 0.93, label=95%, pos=4, offset = 0.3, cex = 0.7)

 

 lines(c(bottom, ceiling), c(0,0), lty=1, lend=butt, lwd=0.7)

 points(x.LOD, 0, pch=21, cex=0.8, bg=white)

 

detach(my.data)


###

 


###

### SAMPLE SET No. 2

 

 

### define data

 labs-c(dilution, total, positive)

 p.1-c(10, 10,  4)

 p.2-c(100, 10, 10)

 p.3-c(1000, 10, 10)

 p.4-c(1, 10, 10)

 p.5-c(10, 10, 10)

 p.6-c(100, 10, 10)

 

### create data frame from matrix

 my.mat-matrix(c(p.1, p.2, p.3, p.4, p.5, p.6), nrow=6, byrow=TRUE)

 dimnames(my.mat)-list(c(1:6),labs)

 my.data-as.data.frame(my.mat)

 attach(my.data)

 my.data

 

### define frequency data

 Y-cbind(positive, total-positive)

 

 

### fit model LOGIT

 model-glm(Y ~ dilution, family=binomial(link=logit), data=my.data)

 

### plot data and regression line

 plot(dilution, positive/total, pch=16, ylim=c(0,1), log = x,
xaxt=n, xlim=c(1, 100),

main=Positive amplification by PCR, ylab=Fraction amplified,
xlab=expression(paste(Standard dilutions (c/,mu,l

 

### add x axis (William Dunlap, personal communication)

 xlim - par(usr)[1:2]

 lab.x - seq(ceiling(xlim[1]), floor(xlim[2]))

 axis(side=1, at=10^lab.x, lab=as.expression(lapply(lab.x,
function(x)bquote(10^.(x)

 

 lines(dilution, fitted(model), lty=1)

 

### set significance level and logit 

 # define significance

   level-0.95 

 # 

Re: [R] Error accessing Vegan package

2012-07-28 Thread Bob Green
In relation to where I was trying to access the zip file it was here: 
http://cran.r-project.org/web/bin/windows/contrib/r-release/vegan_2.0-4.zip
I used this method as my work computer does not allow me to directly 
install packages. Instead I have to install through the local zip file option.


I had no trouble with accessing the package via 
http://probability.ca/cran/bin/windows/contrib/r-release/vegan_2.0-4.zip
that Ben Bolker suggested. I was also able to access Vegan via the 
Install packages in R option, albeit only when at home..


Bob

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Creating sparse matrix of type dgCMatrix directly

2012-07-28 Thread Søren Højsgaard
I want to create a sparse matrix of type dgCMatrix using the Matrix package 
(and the matrix must be of this type even if other more compact representations 
may exist). I do

 library(Matrix)
 m1-Matrix(rep(1,4),nrow=2,ncol=2,sparse=T)
 m1
2 x 2 sparse Matrix of class dsCMatrix

[1,] 1 1
[2,] 1 1

To convert m1, I do
 as(m1, dgCMatrix)
2 x 2 sparse Matrix of class dgCMatrix

[1,] 1 1
[2,] 1 1

Is it possible to construct a dgCMatrix directly i.e. without going through 
the additional as() step above?

Best regards
Søren

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Error accessing Vegan package

2012-07-28 Thread Ben Bolker
On 12-07-28 06:49 AM, Bob Green wrote:
 In relation to where I was trying to access the zip file it was here:
 http://cran.r-project.org/web/bin/windows/contrib/r-release/vegan_2.0-4.zip

  I think you had the wrong URL ... ? I was able to find it at

http://cran.r-project.org/bin/windows/contrib/r-release/vegan_2.0-4.zip

(/web deleted)

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] metafor package, proportions: single groups wrt to a categorical dependent variable

2012-07-28 Thread Michael Dewey

At 01:44 28/07/2012, Dushanthi Pinnaduwage wrote:




Dear all, I am using R version 2.15.0 and 'metafor' package version 
1.6-0. Can this version of the package handle proportions from 
a  categorical dependent variable for single studies?If so how do I 
set up my dataframe for the raw data from different studies? Also 
how do I give inputs, specially xi, mi (or ni) to the function 
escalc()? Thanks,Dushanthi


You do not give us very much to go on here. When you say your outcome 
is proportions do you mean your outcome has more than two values and 
you are generating (or have been provided with) several proportions 
for each study or do you mean it is a binary variable? If the latter 
do you in fact have the numerators and denominators or just the 
proportion? It would also help if you showed us the code you have 
tried and the error message you got (if any) or told us the 
discrepancy between the output you obtained and that which you expected.




[[alternative HTML version deleted]]


Michael Dewey
i...@aghmed.fsnet.co.uk
http://www.aghmed.fsnet.co.uk/home.html

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] A question regarding hypergeometric test

2012-07-28 Thread rad mac
Dear all,

There is a simple question regarding gene set enrichment analysis.
Say, we have a simple denominator and numerator, therefore
hypergeometric test looks like:

p=phyper(white-1,total white,total black,drawn).

However, there is a question regarding database size. Say, my
denominator (total genes on array) is equal to 1. However,
database (say GO database) harbor only 8000 from this 1. The
question is should I subtract genes from all values in phyper that do
not fall into the database? By other words:

original function ie: phyper(50,200,9800,500).

subtract genes that didn't fall into database for example:
phyper(50,180,7700,400).

Should I correct my gene lists with database records? Which way is correct?

Thank you in advance for the replies.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] metafor package, proportions: single groups wrt to a categorical dependent variable

2012-07-28 Thread Michael Dewey

At 15:47 28/07/2012, Dushanthi Pinnaduwage wrote:


Dear Michael,

Thanks very much for your quick response.

My outcome is not binary. Outcome has more than 2 values (levels) 
and I have counts for the levels.

I have not written a code yet.


Well as the name of the key function in metafor (rma.uni) suggests 
metafor at present does univariate meta-analysis. If you can reduce 
the outcome to a single measure of effect (perhaps using proportional 
odds or continuation ratio) then you can enter that with its standard 
error using the yi and sei parameters to rma.uni. If the nature of 
your outcome does not allow that then you have a multivariate 
meta-analysis. In that case you could try mvmeta (available from CRAN)




I would like to know if escalc() can handle categorical outcome data 
(more than 2 proportions). If so, how can I input

the counts of each level of the categorical outcome to this function.

Hope it is little clear now.

Dushanthi



You do not give us very much to go on here. When you say your outcome
is proportions do you mean your outcome has more than two values and
 you are generating (or have been provided with) several proportions
 for each study or do you mean it is a binary variable? If the latter
 do you in fact have the numerators and denominators or just the
 proportion? It would also help if you showed us the code you have
 tried and the error message you got (if any) or told us the
 discrepancy between the output you obtained and that which you expected.



 Date: Sat, 28 Jul 2012 14:25:08 +0100
 To: dushan...@bell.net; r-help@r-project.org
 From: i...@aghmed.fsnet.co.uk
 Subject: Re: [R] metafor package, proportions: single groups 
wrt to a categorical dependent variable


 At 01:44 28/07/2012, Dushanthi Pinnaduwage wrote:




 Dear all, I am using R version 2.15.0 and 'metafor' package version
 1.6-0. Can this version of the package handle proportions from
 a categorical dependent variable for single studies?If so how do I
 set up my dataframe for the raw data from different studies? Also
 how do I give inputs, specially xi, mi (or ni) to the function
 escalc()? Thanks,Dushanthi

 You do not give us very much to go on here. When you say your outcome
 is proportions do you mean your outcome has more than two values and
 you are generating (or have been provided with) several proportions
 for each study or do you mean it is a binary variable? If the latter
 do you in fact have the numerators and denominators or just the
 proportion? It would also help if you showed us the code you have
 tried and the error message you got (if any) or told us the
 discrepancy between the output you obtained and that which you expected.

 
  [[alternative HTML version deleted]]

 Michael Dewey
 i...@aghmed.fsnet.co.uk
 http://www.aghmed.fsnet.co.uk/home.html



Michael Dewey
i...@aghmed.fsnet.co.uk
http://www.aghmed.fsnet.co.uk/home.html

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] network plot problem

2012-07-28 Thread Vlatka Matkovic Puljic
Is it possible to write a numbers of link next to lines of the network?

For exmple, I have 3 trials that have studied two drugs in network.
So, now it seems all links are of same weight.





2012/7/27 Vlatka Matkovic Puljic v.matkovic.pul...@gmail.com

 Thank you!

 It works fine now! :)

  2012/7/27 Rui Barradas ruipbarra...@sapo.pt

 Hello,

 As for the directed = FALSE, it works with me:

 g - graph(given, directed = FALSE)

 As for the label, it's easy. Create an attribute 'name' and then set the
 label to that attribute. For instance, using the alphabet's letters, first
 uppercase.

 V(g)$name - c(LETTERS, letters)[V(g)]  # or any other names
 V(g)$label - V(g)$name

 Keep 'name' and 'label' separate, like this you may change the label if
 needed but the name will still be there.
 Then the plot functions recognise the label attribute.


 Hope this helps,

 Rui Barradas

 Em 27-07-2012 11:37, Vlatka Matkovic Puljic escreveu:

  Thank you Rui.

 With matrix works better. I got plot I have expected to have.

 I want to be undirected. But

 directed = FALSE
 or
 as.undirected(graph)

 are not working? Where I'd gone wrong?

 Shouldn't it be possible to add names  instead of numbers of nods with
 vertex.label= ?



 2012/7/26 Rui Barradas ruipbarra...@sapo.pt

  Hello,

 I don't see the problem.

 given - scan(text=

 15 2
 10 4
 10 4
 10 4
 13 4
 13 4
 15 4
 18 4
 11 5
 2 6
 7 6
 7 6
 7 6
 12 6
 15 6
 15 6
 19 6
 22 6
 24 6
 6 7
 5 12
 5 12
 7 12
 11 12
 13 12
 13 12
 13 12
 13 12
 16 12
 17 12
 23 12
 23 12
 23 12
 23 12
 6 13
 12 13
 6 14
 6 15
 9 15
 12 15
 13 15
 17 16
 16 17
 1 18
 12 18
 23 18
 2 19
 6 19
 24 19
 21 22
 3 25
 5 26
 6 27
 7 27
 15 29
 20 30
 25 31
 28 31
 8 32
 6 33
 14 33
 22 34
 )
 mat - matrix(given, ncol=2, byrow=TRUE)

 g - graph(given)
 # Or, not run (note the transpose)
 #g - graph(t(mat))
 # The edges are exactly what is given
 E(g)

 V(g)$label - V(g)
 g$layout - layout.fruchterman.reingold
 plot(g, edge.arrow.size=0.5, edge.loop.angle=1, edge.curved=FALSE)


 So my guess is you've messed up the graph creation.
 Also, don't post datasets like that, use dput(), lik this:

 dput(given)
 c(15, 2, 10, 4, 10, 4, 10, 4, 13, 4, 13, 4, 15, 4, 18, 4, 11,
 5, 2, 6, 7, 6, 7, 6, 7, 6, 12, 6, 15, 6, 15, 6, 19, 6, 22, 6,
 24, 6, 6, 7, 5, 12, 5, 12, 7, 12, 11, 12, 13, 12, 13, 12, 13,
 12, 13, 12, 16, 12, 17, 12, 23, 12, 23, 12, 23, 12, 23, 12, 6,
 13, 12, 13, 6, 14, 6, 15, 9, 15, 12, 15, 13, 15, 17, 16, 16,
 17, 1, 18, 12, 18, 23, 18, 2, 19, 6, 19, 24, 19, 21, 22, 3, 25,
 5, 26, 6, 27, 7, 27, 15, 29, 20, 30, 25, 31, 28, 31, 8, 32, 6,
 33, 14, 33, 22, 34)

 Now all anyone has to do is to copy that output and paste it into an R
 session.
 (Try it with the matrix to see the result)

 Hope this helps,

 Rui Barradas
 Em 25-07-2012 16:27, Vlatka Matkovic Puljic escreveu:

  **Hi,


 I wanted to create a network of drugs that are being studied together.
 So, I created a file as a graph.
 But plot of my network is not corect!
 It looks like to me that something else is lying behind this network
 links
 created.

 Could someone help me with this?

 Thanx!

 a-read.graph(file=file.choose(), format=edgelist)

 plot.igraph(a)

 File (number present ID of drug):

 15 2
 10 4
 10 4
 10 4
 13 4
 13 4
 15 4
 18 4
 11 5
 2 6
 7 6
 7 6
 7 6
 12 6
 15 6
 15 6
 19 6
 22 6
 24 6
 6 7
 5 12
 5 12
 7 12
 11 12
 13 12
 13 12
 13 12
 13 12
 16 12
 17 12
 23 12
 23 12
 23 12
 23 12
 6 13
 12 13
 6 14
 6 15
 9 15
 12 15
 13 15
 17 16
 16 17
 1 18
 12 18
 23 18
 2 19
 6 19
 24 19
 21 22
 3 25
 5 26
 6 27
 7 27
 15 29
 20 30
 25 31
 28 31
 8 32
 6 33
 14 33
 22 34








  --
 **
 Vlatka Matkovic Puljic
 gsm: +32.474.894953




-- 
**
Vlatka Matkovic Puljic
gsm: +32.474.894953

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] selecting a subset of files to be processed

2012-07-28 Thread Erin Hodgess
Dear R People:

I am using a Linux system in which I have about 3000 files.

I would like to randomly select about 45 of those files to be processed in R.

Could I make the selection in R or should I do it in Linux, please?

This is with R-2.15.1.

Thanks,
erin


-- 
Erin Hodgess
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: erinm.hodg...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] network plot problem

2012-07-28 Thread Rui Barradas

Hello,

It's more or less the same it was with vertices. If the edge labels 
represent weights, then create an appropriate attribute and set the 
labels' values to that attribute's values:


E(g)$weight - something   # keep the weights in a separate attribute
E(g)$label - E(g)$weight # set when needed

The plot function now uses these edge labels.

Rui Barradas

Em 28-07-2012 17:44, Vlatka Matkovic Puljic escreveu:

Is it possible to write a numbers of link next to lines of the network?

For exmple, I have 3 trials that have studied two drugs in network.
So, now it seems all links are of same weight.





2012/7/27 Vlatka Matkovic Puljic v.matkovic.pul...@gmail.com


Thank you!

It works fine now! :)

  2012/7/27 Rui Barradas ruipbarra...@sapo.pt


Hello,

As for the directed = FALSE, it works with me:

g - graph(given, directed = FALSE)

As for the label, it's easy. Create an attribute 'name' and then set the
label to that attribute. For instance, using the alphabet's letters, first
uppercase.

V(g)$name - c(LETTERS, letters)[V(g)]  # or any other names
V(g)$label - V(g)$name

Keep 'name' and 'label' separate, like this you may change the label if
needed but the name will still be there.
Then the plot functions recognise the label attribute.


Hope this helps,

Rui Barradas

Em 27-07-2012 11:37, Vlatka Matkovic Puljic escreveu:


  Thank you Rui.

With matrix works better. I got plot I have expected to have.

I want to be undirected. But

directed = FALSE
or
as.undirected(graph)

are not working? Where I'd gone wrong?

Shouldn't it be possible to add names  instead of numbers of nods with
vertex.label= ?



2012/7/26 Rui Barradas ruipbarra...@sapo.pt

  Hello,

I don't see the problem.

given - scan(text=

15 2
10 4
10 4
10 4
13 4
13 4
15 4
18 4
11 5
2 6
7 6
7 6
7 6
12 6
15 6
15 6
19 6
22 6
24 6
6 7
5 12
5 12
7 12
11 12
13 12
13 12
13 12
13 12
16 12
17 12
23 12
23 12
23 12
23 12
6 13
12 13
6 14
6 15
9 15
12 15
13 15
17 16
16 17
1 18
12 18
23 18
2 19
6 19
24 19
21 22
3 25
5 26
6 27
7 27
15 29
20 30
25 31
28 31
8 32
6 33
14 33
22 34
)
mat - matrix(given, ncol=2, byrow=TRUE)

g - graph(given)
# Or, not run (note the transpose)
#g - graph(t(mat))
# The edges are exactly what is given
E(g)

V(g)$label - V(g)
g$layout - layout.fruchterman.reingold
plot(g, edge.arrow.size=0.5, edge.loop.angle=1, edge.curved=FALSE)


So my guess is you've messed up the graph creation.
Also, don't post datasets like that, use dput(), lik this:

dput(given)
c(15, 2, 10, 4, 10, 4, 10, 4, 13, 4, 13, 4, 15, 4, 18, 4, 11,
5, 2, 6, 7, 6, 7, 6, 7, 6, 12, 6, 15, 6, 15, 6, 19, 6, 22, 6,
24, 6, 6, 7, 5, 12, 5, 12, 7, 12, 11, 12, 13, 12, 13, 12, 13,
12, 13, 12, 16, 12, 17, 12, 23, 12, 23, 12, 23, 12, 23, 12, 6,
13, 12, 13, 6, 14, 6, 15, 9, 15, 12, 15, 13, 15, 17, 16, 16,
17, 1, 18, 12, 18, 23, 18, 2, 19, 6, 19, 24, 19, 21, 22, 3, 25,
5, 26, 6, 27, 7, 27, 15, 29, 20, 30, 25, 31, 28, 31, 8, 32, 6,
33, 14, 33, 22, 34)

Now all anyone has to do is to copy that output and paste it into an R
session.
(Try it with the matrix to see the result)

Hope this helps,

Rui Barradas
Em 25-07-2012 16:27, Vlatka Matkovic Puljic escreveu:

  **Hi,


I wanted to create a network of drugs that are being studied together.
So, I created a file as a graph.
But plot of my network is not corect!
It looks like to me that something else is lying behind this network
links
created.

Could someone help me with this?

Thanx!

a-read.graph(file=file.choose(), format=edgelist)

plot.igraph(a)

File (number present ID of drug):

15 2
10 4
10 4
10 4
13 4
13 4
15 4
18 4
11 5
2 6
7 6
7 6
7 6
12 6
15 6
15 6
19 6
22 6
24 6
6 7
5 12
5 12
7 12
11 12
13 12
13 12
13 12
13 12
16 12
17 12
23 12
23 12
23 12
23 12
6 13
12 13
6 14
6 15
9 15
12 15
13 15
17 16
16 17
1 18
12 18
23 18
2 19
6 19
24 19
21 22
3 25
5 26
6 27
7 27
15 29
20 30
25 31
28 31
8 32
6 33
14 33
22 34






  --
**
Vlatka Matkovic Puljic
gsm: +32.474.894953






__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] selecting a subset of files to be processed

2012-07-28 Thread Joshua Wiley
Hi Erin,

It is not difficult to imagine doing it either in R or via the shell.
If they are all in the same directory, I would tend towards R, just
because you can easily set the seed and keep that information so you
can reproduce your random selection.

If the wd is in the directory with the files, from R just:

set.seed(10)
toprocess - sample(list.files(), size = 45)

Cheers,

Josh

On Sat, Jul 28, 2012 at 10:49 AM, Erin Hodgess erinm.hodg...@gmail.com wrote:
 Dear R People:

 I am using a Linux system in which I have about 3000 files.

 I would like to randomly select about 45 of those files to be processed in R.

 Could I make the selection in R or should I do it in Linux, please?

 This is with R-2.15.1.

 Thanks,
 erin


 --
 Erin Hodgess
 Associate Professor
 Department of Computer and Mathematical Sciences
 University of Houston - Downtown
 mailto: erinm.hodg...@gmail.com

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



-- 
Joshua Wiley
Ph.D. Student, Health Psychology
Programmer Analyst II, Statistical Consulting Group
University of California, Los Angeles
https://joshuawiley.com/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] selecting a subset of files to be processed

2012-07-28 Thread Rui Barradas

Hello,

If the files are to be processed in R select a random sample in R.
Using list.files() you can assign a character vector with the filenames 
of interest and then sample from that vector.


?list.files
filenames - list.files(path, pattern)

rand.sampl - sample(filenames, 45)

Hope this helps,

Rui Barradas

Em 28-07-2012 18:49, Erin Hodgess escreveu:

Dear R People:

I am using a Linux system in which I have about 3000 files.

I would like to randomly select about 45 of those files to be processed in R.

Could I make the selection in R or should I do it in Linux, please?

This is with R-2.15.1.

Thanks,
erin




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] selecting a subset of files to be processed

2012-07-28 Thread Ted Harding
And, in addition to the tip from Rui (and similar from Joshua) below,
I would advise that there is one good reason not to try doing it
in pure Linux.

The only source (that I know of) in Linux itself for random numbers
can be tapped by something like

  cat /dev/random  filename

/dev/random stores noise generated by the timings of system events
(keyboard presses, mouse-clicks, disk accesses, interrupts, etc.)
after subjecting them to a high-entropy stirring process. See:

  man random

It yields them in the form of random bytes (each of 8 random 0/1 bits)
and you would have to devise some means of coverting those onto a
form suitable for accessing a directory listing at random. Not a
pretty task!

There is also the command 'rand' available in the openSSL toolkit,
but that still outputs the results in the same format as /dev/random.

If you really want to do this outside R, the I would suggest writing
a little C program (to be run from the Linux command line). C can
do its own random number generation, with results returned as
real (double), and then apply these to select at random from the
contents of a file generated by something like

  ls filesdir  filelist.txt

and output the random selection.

Ted.

On 28-Jul-2012 18:00:38 Rui Barradas wrote:
 Hello,
 
 If the files are to be processed in R select a random sample in R.
 Using list.files() you can assign a character vector with the filenames 
 of interest and then sample from that vector.
 
 ?list.files
 filenames - list.files(path, pattern)
 
 rand.sampl - sample(filenames, 45)
 
 Hope this helps,
 
 Rui Barradas
 
 Em 28-07-2012 18:49, Erin Hodgess escreveu:
 Dear R People:

 I am using a Linux system in which I have about 3000 files.

 I would like to randomly select about 45 of those files to be processed in
 R.

 Could I make the selection in R or should I do it in Linux, please?

 This is with R-2.15.1.

 Thanks,
 erin


 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

-
E-Mail: (Ted Harding) ted.hard...@wlandres.net
Date: 28-Jul-2012  Time: 19:32:26
This message was sent by XFMail

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Creating sparse matrix of type dgCMatrix directly

2012-07-28 Thread Douglas Bates
On Sat, Jul 28, 2012 at 7:26 AM, Søren Højsgaard sor...@math.aau.dk wrote:
 I want to create a sparse matrix of type dgCMatrix using the Matrix package 
 (and the matrix must be of this type even if other more compact 
 representations may exist). I do

 library(Matrix)
 m1-Matrix(rep(1,4),nrow=2,ncol=2,sparse=T)
 m1
 2 x 2 sparse Matrix of class dsCMatrix

 [1,] 1 1
 [2,] 1 1

 To convert m1, I do
 as(m1, dgCMatrix)
 2 x 2 sparse Matrix of class dgCMatrix

 [1,] 1 1
 [2,] 1 1

 Is it possible to construct a dgCMatrix directly i.e. without going through 
 the additional as() step above?

The Matrix function is a high-level function that is designed to
produce a compact and informative representation of its arguments.
That's why it checks for symmetry, triangularity, etc.

To decide how to bypass these checks it would be useful to know what
you are starting with.  Will it be a dense matrix or a triplet
representation or ...?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] How to extract unique indices for time series Data?

2012-07-28 Thread Christofer Bogaso

Dear all, I am into following coding:

 library(quantmod)
 Data1 - get(getSymbols(TCS.NS, from = as.Date(2008-01-01), 
return.class = zoo))[,4]

Warning messages:
1: In zoo(cd, order.by = index(x), ...) :
some methods for “zoo” objects do not work if the index entries in 
‘order.by’ are not unique

2: In zoo(rval, index(x)[i]) :
some methods for “zoo” objects do not work if the index entries in 
‘order.by’ are not unique


Here, I guess there are some duplicated dates-index. Is there any 
function available to automatically extract unique indices ???


Thanks and regards,

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to extract unique indices for time series Data?

2012-07-28 Thread Bert Gunter
?unique
?duplicated


-- Bert

On Sat, Jul 28, 2012 at 12:57 PM, Christofer Bogaso
bogaso.christo...@gmail.com wrote:
 Dear all, I am into following coding:

 library(quantmod)
 Data1 - get(getSymbols(TCS.NS, from = as.Date(2008-01-01),
 return.class = zoo))[,4]
 Warning messages:
 1: In zoo(cd, order.by = index(x), ...) :
 some methods for “zoo” objects do not work if the index entries in
 ‘order.by’ are not unique
 2: In zoo(rval, index(x)[i]) :
 some methods for “zoo” objects do not work if the index entries in
 ‘order.by’ are not unique

 Here, I guess there are some duplicated dates-index. Is there any function
 available to automatically extract unique indices ???

 Thanks and regards,

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to extract unique indices for time series Data?

2012-07-28 Thread Gabor Grothendieck
On Sat, Jul 28, 2012 at 3:57 PM, Christofer Bogaso
bogaso.christo...@gmail.com wrote:
 Dear all, I am into following coding:

 library(quantmod)
 Data1 - get(getSymbols(TCS.NS, from = as.Date(2008-01-01),
 return.class = zoo))[,4]
 Warning messages:
 1: In zoo(cd, order.by = index(x), ...) :
 some methods for “zoo” objects do not work if the index entries in
 ‘order.by’ are not unique
 2: In zoo(rval, index(x)[i]) :
 some methods for “zoo” objects do not work if the index entries in
 ‘order.by’ are not unique

 Here, I guess there are some duplicated dates-index. Is there any function
 available to automatically extract unique indices ???


Try this:


library(quantmod)
Data1 - Cl(getSymbols(TCS.NS, from = as.Date(2008-01-01),
auto.assign = FALSE))
dup - duplicated(time(Data1), fromLast = TRUE)
z - as.zoo(Data1[!dup])

-- 
Statistics  Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] minor axis ticks in trellis graphics?

2012-07-28 Thread Peter Ehlers

Dear Martin,

Mea culpa! I screwed up. I was answering (not well, at that) a
different question and somehow managed to make it a response to
your request.

Here's what may be a solution to your problem.
(see also this post by ilai:
https://stat.ethz.ch/pipermail/r-help/attachments/20120718/7133b673/attachment.pl)

We define appropriate locations for the x-ticks to go with
suitably defined labels and corresponding tick lengths.

## data:
   d - data.frame(x = 1:11, y = rnorm(11))

## Where to put ticks and labels
   myat  - 1:11
   mylab - head(c(rbind(LETTERS[1:6], )), -1)
  # putting letters at every second tick

## lengths of ticks
   mytck - head(c(rbind(rep(1,6), .5)), -1)

## myat, mylab, mytck should all be same length

## plot
   xyplot(y ~ x, data = d, xlab=,
 scales = list(x = list(
at = myat, labels = mylab)),
 par.settings = list(
axis.components = list(
   bottom = list(tck = mytck

## Try it with different myat etc to see how it works

   myat  - seq(1, 11, by = 2/3)
   mylab - head(c(rbind(LETTERS[1:6], , )), -2)
   mytck - head(c(rbind(rep(2,6), .5, .5)), -2)

   myat  - seq(1, 11, by = 1/2)
   mylab - head(c(rbind(LETTERS[1:6], , , )), -3)
   mytck - head(c(rbind(rep(2,6), .5, 1, .5)), -3)

The trick is in constructing suitable vectors myat, mylab,
and mytck which you can do any way you like but I would
check that the lengths are equal (recycling works fine
for simple cases but always seems to trip me up in more
complicated cases).

Peter Ehlers


On 2012-07-26 07:02, Martin Ivanov wrote:


Dear Peter,

Thank You very much for your suggestion. However, it seems to me to make a 
completely new notation to the x axis. While I just need to add minor ticks to 
the axis, that is smaller ticks inbetween the basic ticks.
The other option would be to say which ticks to be small and which big, but I 
have no idea how.

Any suggestions will be appreciated.

Best regards,

Martin



 embarassing junk from P. Ehlers cut 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] puzzling classical Mahalanobis distances from covMcd() {robustbase}

2012-07-28 Thread David L Carlson
The values should probably be labeled initial instead of raw which is
how they are labeled in the source. The Details section of manual indicates
that the first step is to identify a subset of the original data between .5
and 1 whose covariance matrix has the lowest possible determinant. The next
paragraph:

The raw MCD estimate of location is then the average of these h points,
whereas the raw MCD estimate of scatter is their covariance matrix,
multiplied by a consistency factor and a finite sample correction factor (to
make it consistent at the normal model and unbiased at small samples).

Following your example:
 set.seed(42)
 x - matrix(rnorm(10*3), ncol = 3)
 xmeans - colMeans(x)
 Sx - cov(x)
 D2rb - covMcd(x)
 D2rb$raw.weights
 [1] 0 1 1 1 1 1 0 1 0 1  == Note that the raw weights eliminate obs 1, 7,
and 9
 xmeans; D2rb$raw.center 
[1]  0.5472968 -0.1634567 -0.1780795== Compare original means 
[1]  0.08172336 -0.03067387 -0.23956925 and raw means
 colMeans(x[as.logical(D2rb$raw.weights),]) == means with 1, 7, and 9
eliminated
[1]  0.08172336 -0.03067387 -0.23956925  == This matches
D2rb$raw.center

So the raw values are taken for a subset, h, which includes observations
2, 3, 4, 5, 6, 8, and 10. Given that the raw.center and raw.cov are based on
a subset of the original data, the mahalanobis distances will not be the
same either.

--
David L Carlson
Associate Professor of Anthropology
Texas AM University
College Station, TX 77843-4352

 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of Fraser D. Neiman
 Sent: Friday, July 27, 2012 7:16 AM
 To: r-help@r-project.org
 Subject: [R] puzzling classical Mahalanobis distances from covMcd()
 {robustbase}
 
 Greetings,
 
 I am puzzled about why the _classical_ Mahalanobis distances that I get
 using
 the {stats} mahalanobis() function do not match the distances I get
 from the
 {robustbase} covMcd() function. Here is an example:
 
 x - matrix(rnorm(10*3), ncol = 3)
 
 #here is the {stats} result:
 Sx - cov(x)
 D2 - mahalanobis(x, colMeans(x), Sx)
 D2
 
 [1] 1.5135795 1.3761046 1.0367444 1.8111585 4.3038621 5.3195918
 3.2798665
 5.7559301
  [9] 2.2172150 0.3859475
 
 
 #here is the {robustbase} result
 Library(robustbase)
 D2rb- covMcd(x)
 D2rb$raw.mah
 
 [1] 0.7737193 1.1177445 0.7290794 0.6275703 3.5517622 6.0334350
 1.0582663
 5.7169250
  [9] 0.9420184 0.4210470
 
 According to the help file for covMcd{robustbase}
 
 raw.mah   mahalanobis distances of the observations based on the raw
 estimate of
 the location and scatter.
 
 So I think the second set of numbers should match the first. But they
 do not.
 What am I missing here?
 
 Thanks, Fraser
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Creating sparse matrix of type dgCMatrix directly

2012-07-28 Thread Søren Højsgaard


-Original Message-
From: dmba...@gmail.com [mailto:dmba...@gmail.com] On Behalf Of Douglas Bates
Sent: 28. juli 2012 20:36
To: Søren Højsgaard
Cc: r-help@r-project.org
Subject: Re: [R] Creating sparse matrix of type dgCMatrix directly

On Sat, Jul 28, 2012 at 7:26 AM, Søren Højsgaard sor...@math.aau.dk wrote:
 I want to create a sparse matrix of type dgCMatrix using the Matrix 
 package (and the matrix must be of this type even if other more 
 compact representations may exist). I do

 library(Matrix)
 m1-Matrix(rep(1,4),nrow=2,ncol=2,sparse=T)
 m1
 2 x 2 sparse Matrix of class dsCMatrix

 [1,] 1 1
 [2,] 1 1

 To convert m1, I do
 as(m1, dgCMatrix)
 2 x 2 sparse Matrix of class dgCMatrix

 [1,] 1 1
 [2,] 1 1

 Is it possible to construct a dgCMatrix directly i.e. without going through 
 the additional as() step above?

The Matrix function is a high-level function that is designed to produce a 
compact and informative representation of its arguments.
That's why it checks for symmetry, triangularity, etc.

To decide how to bypass these checks it would be useful to know what you are 
starting with.  Will it be a dense matrix or a triplet representation or ...?

[] Dear Doug,
[] I use sparse matrices to represent graphs so the matrices contain zeros 
and non-zeros (in most cases ones). I need the matrices to be of the dgCMatrix 
type because these matrices can be used as sparse matrices in the RcppEigen 
package. For now I have created my own dgCMatrix() function doing what I've 
sketched above, but there is - as you also indicate - an overhead in the checks 
in the Matrix() function, and also an overhead in using as().
Regards
Søren

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to extract unique indices for time series Data?

2012-07-28 Thread Hasan Diwan

  Here, I guess there are some duplicated dates-index. Is there any
 function available to automatically extract unique indices ???


?zooreg should do what you want, if I'm understanding the question properly.
-- 
Sent from my mobile device
Envoyait de mon portable

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] use R remotely (ssh / shell / emacs)

2012-07-28 Thread Martin Batholdy
Hi,

I have a laptop and a desktop PC.
Now I was wondering, if it is possible to create a setup in which you can use 
the laptop (where ever you are) to remotely access the PC, open an R-instance 
and let the PC do the heavy computation.

I have no idea about remote control, wake-on-lan, ssh and all these things.
I just wanted to know if someone here on the list is actually working with such 
a setup and if, once all is configured, it works reliable.

how much work it is to set it up with an editor like emacs?
And for which buzzwords should I search the internet?


sorry if that is the wrong mailing-list for this kind of questions,
I just hope to get some input on the topic.

thanks!

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] readRDS, In as.double.xts(fishReport$count) : NAs introduced by coercion

2012-07-28 Thread Yolande Tra
Hello,

I looked in the R-help but could not find an archive addressing the
following. I would like to convert a character to numeric after reading a
file with RDS extension. After using as.numeric, I checked if it is
numeric. It was not converted. Please help.

Here is my code

Report - readRDS(file=RDS/Report.RDS)
 Report[1:2,]
dive_id date time  species count
size
2008-08-06 08:49:00  108/06/2008 8:49:00 S. OYT15
6
2008-08-06 08:49:00  108/06/2008 8:49:00 S. atrovirens 1
23
site  depth level TRANSECT VIS_M TEMP_C swell_URSKI
2008-08-06 08:49:00 Hopkins 15  B   1  3.5 13.9 1.0686708
2008-08-06 08:49:00 Hopkins 15  B   1  3.5 13.9 1.0686708
 Report$count-as.numeric(Report$count)
Warning message:
In as.double.xts(fishReport$count) : NAs introduced by coercion
is.numeric(Report$count)
[1] FALSE
Thank you,
Y

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] use R remotely (ssh / shell / emacs)

2012-07-28 Thread Jeff Newmiller
Unix-ish operating systems often have X-Windows installed for GUI programs.  In 
this environment, it is very common to log in remotely and run GUI programs on 
the remote computer that open windows on your local screen.

It is also possible to run a vnc or Remote Terminal service/daemon on the 
remote computer so that you can see a copy of the GUI screen of the remote 
computer on your local computer. (RT is much more common on Microsoft Windows 
remote computers, vnc is more operating-system agnostic.)

R runs fine either way, but each approach has its own quirks and advantages.

Setting such software up is definitely off topic here.
---
Jeff NewmillerThe .   .  Go Live...
DCN:jdnew...@dcn.davis.ca.usBasics: ##.#.   ##.#.  Live Go...
  Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
/Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
--- 
Sent from my phone. Please excuse my brevity.

Martin Batholdy batho...@googlemail.com wrote:

Hi,

I have a laptop and a desktop PC.
Now I was wondering, if it is possible to create a setup in which you
can use the laptop (where ever you are) to remotely access the PC, open
an R-instance and let the PC do the heavy computation.

I have no idea about remote control, wake-on-lan, ssh and all these
things.
I just wanted to know if someone here on the list is actually working
with such a setup and if, once all is configured, it works reliable.

how much work it is to set it up with an editor like emacs?
And for which buzzwords should I search the internet?


sorry if that is the wrong mailing-list for this kind of questions,
I just hope to get some input on the topic.

thanks!

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] selecting a subset of files to be processed

2012-07-28 Thread Erin Hodgess
Thanks so much!


On Sat, Jul 28, 2012 at 1:32 PM, Ted Harding ted.hard...@wlandres.net wrote:
 And, in addition to the tip from Rui (and similar from Joshua) below,
 I would advise that there is one good reason not to try doing it
 in pure Linux.

 The only source (that I know of) in Linux itself for random numbers
 can be tapped by something like

   cat /dev/random  filename

 /dev/random stores noise generated by the timings of system events
 (keyboard presses, mouse-clicks, disk accesses, interrupts, etc.)
 after subjecting them to a high-entropy stirring process. See:

   man random

 It yields them in the form of random bytes (each of 8 random 0/1 bits)
 and you would have to devise some means of coverting those onto a
 form suitable for accessing a directory listing at random. Not a
 pretty task!

 There is also the command 'rand' available in the openSSL toolkit,
 but that still outputs the results in the same format as /dev/random.

 If you really want to do this outside R, the I would suggest writing
 a little C program (to be run from the Linux command line). C can
 do its own random number generation, with results returned as
 real (double), and then apply these to select at random from the
 contents of a file generated by something like

   ls filesdir  filelist.txt

 and output the random selection.

 Ted.

 On 28-Jul-2012 18:00:38 Rui Barradas wrote:
 Hello,

 If the files are to be processed in R select a random sample in R.
 Using list.files() you can assign a character vector with the filenames
 of interest and then sample from that vector.

 ?list.files
 filenames - list.files(path, pattern)

 rand.sampl - sample(filenames, 45)

 Hope this helps,

 Rui Barradas

 Em 28-07-2012 18:49, Erin Hodgess escreveu:
 Dear R People:

 I am using a Linux system in which I have about 3000 files.

 I would like to randomly select about 45 of those files to be processed in
 R.

 Could I make the selection in R or should I do it in Linux, please?

 This is with R-2.15.1.

 Thanks,
 erin



 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

 -
 E-Mail: (Ted Harding) ted.hard...@wlandres.net
 Date: 28-Jul-2012  Time: 19:32:26
 This message was sent by XFMail
 -



-- 
Erin Hodgess
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: erinm.hodg...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] use R remotely (ssh / shell / emacs)

2012-07-28 Thread Ranjan Maitra
I don't know if using R over ssh/remotely is off-topic, but the user
needs to provide more information. I am not knowledgeable about
wakeup-on-lan (isn't this set in the BIOS?) but I think it would be
helpful to know OS's, etc. How much more powerful is the desktop than
the laptop? If not, does it make sense to clutter b/w? Also, what does
the user envisage doing in the R simulations? I am assuming that this
will be run in the batch mode otherwise there does not seem to be much
point.

Ranjan

On Sat, 28 Jul 2012 19:46:57 -0700 Jeff Newmiller
jdnew...@dcn.davis.ca.us wrote:

 Unix-ish operating systems often have X-Windows installed for GUI programs.  
 In this environment, it is very common to log in remotely and run GUI 
 programs on the remote computer that open windows on your local screen.
 
 It is also possible to run a vnc or Remote Terminal service/daemon on the 
 remote computer so that you can see a copy of the GUI screen of the remote 
 computer on your local computer. (RT is much more common on Microsoft Windows 
 remote computers, vnc is more operating-system agnostic.)
 
 R runs fine either way, but each approach has its own quirks and advantages.
 
 Setting such software up is definitely off topic here.
 ---
 Jeff NewmillerThe .   .  Go Live...
 DCN:jdnew...@dcn.davis.ca.usBasics: ##.#.   ##.#.  Live Go...
   Live:   OO#.. Dead: OO#..  Playing
 Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
 /Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
 ---
 Sent from my phone. Please excuse my brevity.
 
 Martin Batholdy batho...@googlemail.com wrote:
 
 Hi,
 
 I have a laptop and a desktop PC.
 Now I was wondering, if it is possible to create a setup in which you
 can use the laptop (where ever you are) to remotely access the PC, open
 an R-instance and let the PC do the heavy computation.
 
 I have no idea about remote control, wake-on-lan, ssh and all these
 things.
 I just wanted to know if someone here on the list is actually working
 with such a setup and if, once all is configured, it works reliable.
 
 how much work it is to set it up with an editor like emacs?
 And for which buzzwords should I search the internet?
 
 
 sorry if that is the wrong mailing-list for this kind of questions,
 I just hope to get some input on the topic.
 
 thanks!
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
-- 
Important Notice: This mailbox is ignored: e-mails are set to be
deleted on receipt. For those needing to send personal or professional
e-mail, please use appropriate addresses.


FREE 3D EARTH SCREENSAVER - Watch the Earth right on your desktop!

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] quantreg Wald-Test

2012-07-28 Thread stefan23
Dear all,
I know that my question is somewhat special but I tried several times to
solve the problems on my own but I am unfortunately not able to compute the
following test statistic using the quantreg package. Well, here we go, I
appreciate every little comment or help as I really do not know how to tell
R what I want it to do^^
My situation is as follows: I have a data set containing a (dependent)
vector Y and the regressor X. My aim is to check whether the two variables
do not granger-cause each other in quantiles. I started to compute via
quantreg for a single tau:= q:
rq(Y_t~Y_(t-1)+Y_(t-2)+...+X_(t-1)+X_(t-2)+...,tau=q) 
This gives me the quantile regression coefficients. Now I want to check
whether all the coefficients of X are equal to zero (for this specific tau).
Can I do this by applying rq.anova ? I have already asked a similiar
question but I am not sure if anova is really calculating this for me..
Currently I am calculating
fitunrestricted=rq(Y_t~Y_(t-1)+Y_(t-2)+...+X_(t-1)+X_(t-2)+...,tau=q) 
fitrestrited=rq(Y_t~Y_(t-1)+Y_(t-2)+...,tau=q)
anova(fitrestricted,fitunrestricted)
If this is correct can you tell me how the test value is calculated in this
case, or in other words:
My next step is going to replicate this procedure for a whole range of
quantiles (say for tau in [a,b]). To apply a sup-Wald-test I am wondering if
it is correct to choose the maximum of the different test values and to
simulate the critical values by using the data tabulated in Andrees(1993)
(or simulate the vectors of independent Brownian Motions)...please feel free
to comment, I am really looking forward to your help!
Thank you very much 
Cheers
Stefan




--
View this message in context: 
http://r.789695.n4.nabble.com/quantreg-Wald-Test-tp4638198.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Problem with a regression - Dataset Workinghours

2012-07-28 Thread Giorgio Monti
I'm a student. I'm working on a research using the statistical program R
2.15.1.
Here's my problem: how i can do a regression considering only values over a
certain limit?
For example, considering the dataset Workinghour of the Ecdat package,
is possible to build a predictive model that express the probability that a
wife works more than 8 hours per day?
The dataset includes 3382 observation on the number of hours spent working
by wifes per year in USA.

hoursday=hours/240
index-which(hoursday=8)
hoursday[index]

As you see, I'm able to extract the values that in 'hoursday' (which is
hours/240 working days in one year) are  8,0 but obviously i can't do a
regression cause the extracted data are a subset of the entire dataset (955
observations), while the other variables, like age, occupation, income,
etc. are still complete(3382).

So i can't do:
lm = lm(hoursday[index] ~
income+age+education+unemp+child5+child13+child17+nonwhite+owned+mortgage+occupation)
In fact R gives me: Error in model.frame.default(formula =
hoursday[index] ~ income, drop.unused.levels = TRUE) : variable lengths
differ (found for 'income').

Can you help me?

Thank you.

Giorgio

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] producing a graph with glm poisson distributed respons count data and categorical independant variables

2012-07-28 Thread babs
Hi,

I do have replicates, but not many. The offsets are my the number of
replicates actually, i misled myself by thinking i could add the counts of
the replicates all up and then go further with the offset function. I 
decided to split up my experiments in order to interpret them, they were
actually from the beginning split up experiments. Because i thought i could
see if the bees reacted in a same manner to the different experiments i
thought i could analyse it with an interaction factor to reveal whether this
was true. But i want now want to analyse them apart so i can draw
conclusions from the both experiments apart.
I did the following  with the counts from p1, including the replicates so  i
have only the type of field-margin as a variable, as i was only interested
in this from the beginning:
 
mengsel count
C   39
C   38
A   79
A   96
A   278
D   15
D   15
B   322
B   449
B   262

a.data.p1-read.table(a.data.p1.txt,header=TRUE,sep=)
a.data.p1
fit.sat.a.p1-glm(count~mengsel,data=a.data.p1,family=poisson)
anova(fit.sat.a.p1,test=Chisq)
fit.main.a.p1-glm(count~1,data=a.data.p1,family=poisson)
anova(fit.sat.a.p1,fit.main.a.p1,test=Chisq)
extractAIC(fit.sat.a.p1)
extractAIC(fit.main.a.p1)
summary(fit.sat.a.p1)

This tells me that the saturated model is better explaning then the other
so:

Call:
glm(formula = count ~ mengsel, family = poisson, data = a.data.p1)

Deviance Residuals: 
Min   1Q   Median   3Q  Max  
-6.4531  -3.7799  -0.0404   0.0603   9.2385  

Coefficients:
Estimate Std. Error z value Pr(|z|)
(Intercept)  5.017280.04698  106.79   2e-16 ***
mengselB 0.824330.05635   14.63   2e-16 ***
mengselC-1.366620.12327  -11.09   2e-16 ***
mengselD-2.309230.18852  -12.25   2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

(Dispersion parameter for poisson family taken to be 1)

Null deviance: 1385.58  on 9  degrees of freedom
Residual deviance:  202.01  on 6  degrees of freedom
AIC: 273.15

Number of Fisher Scoring iterations: 5

How can i produce a graph for this?

I am worried that i still do not have enough replicates to actually draw
descent conclusions or conclusions at all...in my second experiment i do not
have replicates  for 2 out of four types of field margins, for the other two
i only have 2 replicates. 

this being the counts from the second experiment:

C   3   p2
C   1   p2
A   90  p2
A   29  p2
D   0   p2
B   157 p2

thanks,

babs



--
View this message in context: 
http://r.789695.n4.nabble.com/producing-a-graph-with-glm-poisson-distributed-respons-count-data-and-categorical-independant-variabs-tp4638110p4638192.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] using save() to work with objects that exceed memory capacity

2012-07-28 Thread David Romano
Context:  I'm relatively new to R and am working with very large datasets.

General problem:  If working on a dataset requires that I produce more than
two objects of roughly the size of the dataset, R quickly uses up its
available memory and slows to a virtual halt.

My tentative solution:  To save and remove objects as they're created, and
load them when I need them.  To do this I'm trying to automatically
generate file names derived from these objects, and use these in save().

My specific question to the list:  How do I capture the string that names
an object I want to save, in such a way that I can use it in a function
that calls save()?

For example, suppose I create a matrix and then save it follows:
 mat-matrix(1:9,3,3)
 save(mat, file=matfile)
Then I get a file of the kind I'd like: the command 'load(matfile)'
retrieves the correct matrix, with the original name 'mat'.

Further, if I instead save it this way:
 objectname-mat
 save(list=ls(pattern=objectname), file=matfile)
then I get the same positive result.

But now suppose I create a function
 saveobj - function(objectname,objectfile)
+   {
+ save(list=ls(pattern=objectname),file=objectfile);
+ return()};
Then if I now try to save 'mat' by
 matname-mat
 saveobj(matname,matfile)
I do not get the same result; namely, the command 'load(mat)' retrieves
no objects.  Why is this?


I'd be grateful for any help on either my specific questions, or
suggestions of a better ways to address the issue of limited memory.

Thanks,
David Romano

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] garch external regressors

2012-07-28 Thread saraberta
Hi everyone,
i have a terribly stupid question, but i hope someone could reply me! when
i'm fitting a garch model and i want to insert in the variance model some
external regressors as temporal dummy or lags of other variables and so on,
which timeline should follow the time series that i select? in other words,
if the dependent variable is at time t, the external regressors in the
ugarchspec formula should be at time t or t-1?
thanks a lot
sara



--
View this message in context: 
http://r.789695.n4.nabble.com/garch-external-regressors-tp4638190.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] metafor package, proportions: single groups wrt to a categorical dependent variable

2012-07-28 Thread Dushanthi Pinnaduwage

 Dear Michael, Thanks very much for your quick response. My outcome is not 
binary. Outcome has more than 2 values (levels) and I have counts for the 
levels. I have not written a code yet.  I would like to know if escalc() can 
handle categorical outcome data (more than 2 proportions). If so, how can I 
input the counts of each level of the categorical outcome to this function. 
Hope it is little clear now.  Dushanthi   You do not give us very much to go on 
here. When you say your outcome 
is proportions do you mean your outcome has more than two values and 
 you are generating (or have been provided with) several proportions 
 for each study or do you mean it is a binary variable? If the latter 
 do you in fact have the numerators and denominators or just the 
 proportion? It would also help if you showed us the code you have 
 tried and the error message you got (if any) or told us the 
 discrepancy between the output you obtained and that which you expected. 
  Date: Sat, 28 Jul 2012 14:25:08 +0100
 To: dushan...@bell.net; r-help@r-project.org
 From: i...@aghmed.fsnet.co.uk
 Subject: Re: [R] metafor package, proportions: single groups wrt to a  
 categorical dependent variable
 
 At 01:44 28/07/2012, Dushanthi Pinnaduwage wrote:
 
 
 
 
 Dear all, I am using R version 2.15.0 and 'metafor' package version 
 1.6-0. Can this version of the package handle proportions from 
 a  categorical dependent variable for single studies?If so how do I 
 set up my dataframe for the raw data from different studies? Also 
 how do I give inputs, specially xi, mi (or ni) to the function 
 escalc()? Thanks,Dushanthi
 
 You do not give us very much to go on here. When you say your outcome 
 is proportions do you mean your outcome has more than two values and 
 you are generating (or have been provided with) several proportions 
 for each study or do you mean it is a binary variable? If the latter 
 do you in fact have the numerators and denominators or just the 
 proportion? It would also help if you showed us the code you have 
 tried and the error message you got (if any) or told us the 
 discrepancy between the output you obtained and that which you expected.
 
 
  [[alternative HTML version deleted]]
 
 Michael Dewey
 i...@aghmed.fsnet.co.uk
 http://www.aghmed.fsnet.co.uk/home.html
 
  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] low resolution word map

2012-07-28 Thread Thomas Steiner
Hi,
I'd like to have a low resolution word map in R.
The maps package has this option, but if I use the argument, the map
looses sense: Russia and Australia get empty etc

library(maps)
m=map(col=skyblue,fill=TRUE,plot=TRUE,resolution=10)
length(m$x)

If I drop the fill=TRUE, the effect of resolutaion=1 is lost,
ie no change. Is there any other package or could I use the resolution
argument differently?
Thanks,
Thomas

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] lm without intercept

2012-07-28 Thread citynorman
I've just picked up R (been using Matlab, Eviews etc) and I'm having the same
issue. Running reg=lm(ticker1~ticker2)  gives R^2=50% while running
reg=lm(ticker1~0+ticker2) gives R^2=99%!! The charts suggest the fit is
worse not better and indeed Eviews/Excel/Matlab all say R^2=15% with
intercept=0. How come R calculates a totally different value?!

Call:
lm(formula = ticker1 ~ ticker2)

Residuals:
 Min   1Q   Median   3Q  Max 
-0.22441 -0.03380  0.01099  0.04891  0.16688 

Coefficients:
Estimate Std. Error t value Pr(|t|)
(Intercept)  1.570620.08187   19.18   2e-16 ***
ticker2  0.617220.02699   22.87   2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Residual standard error: 0.07754 on 530 degrees of freedom
Multiple R-squared: 0.4967, Adjusted R-squared: 0.4958 
F-statistic: 523.1 on 1 and 530 DF,  p-value:  2.2e-16

Call:
lm(formula = ticker1 ~ 0 + ticker2)

Residuals:
  Min1QMedian3Q   Max 
-0.270785 -0.069280 -0.007945  0.087340  0.268786 

Coefficients:
Estimate Std. Error t value Pr(|t|)
ticker2 1.134508   0.001441   787.2   2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Residual standard error: 0.1008 on 531 degrees of freedom
Multiple R-squared: 0.9991, Adjusted R-squared: 0.9991 
F-statistic: 6.197e+05 on 1 and 531 DF,  p-value:  2.2e-16  


Jan private wrote
 
 Hi,
 
 thanks for your help. I'm beginning to understand things better.
 
 If you plotted your data, you would realize that whether you fit the
 'best' least squares model or one with a zero intercept, the fit is
 not going to be very good
 Do the data cluster tightly around the dashed line?
 No, and that is why I asked the question. The plotted fit doesn't look
 any better with or without intercept, so I was surprised that the
 R-value etc. indicated an excellent regression (which I now understood
 is the wrong interpretation).
 
 One of the references you googled suggests that intercepts should never
 be omitted. Is this true even if I know that the physical reality behind
 the numbers suggests an intercept of zero?
 
 Thanks,
   Jan
 
 __
 R-help@ mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 



--
View this message in context: 
http://r.789695.n4.nabble.com/lm-without-intercept-tp3312429p4638204.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Mixed-model with paired design covariates

2012-07-28 Thread Nathan Ranc
Dear all,

I make habitat suitability models for animal species. The purpose of my
research is to investigate the accuracy of different models.

I clearly have a nested design:

- accuracy_measure - response variable
- 2 model types (model_type) - fixed effect
- 230 species (species) - random effect
- 10 replicates/species (replicate) - random effect
- 10 subreplicate/replicate - observation

So I have: 10*10*230 observations/model, identified as
speciesID_replicate_subreplicate (species_ID ranging 1:230, replicate 1:10
and subreplicate 1:10)

One could think about such mixed-effect model:

my.model-lme(fixed = accuracy_measure~model_type, random =
~1|species/replicate)

I do not expand here into model simplification nor if it is best to use lme
or lmer, YET... but here are more conceptual questions

1) my replicates  subreplicates are paired in the sense that they come
from the same split of the data. As an example for species X, the 20
observations of replicate1 of model A and B (10 for A and 10 for B) are
linked by a same data split which is likely to influence my accuracy
measure. In the same way the 2 observations replicate1_subreplicate2 of
model A and B (1 for A and 1 for B) are also linked. Is there a way to
introduce such pairing in a mixed effect model?

2) several continuous covariates, attributes of species (number of points
for modelization, size in km2 of the range) may influence the measures of
accuracy and I may be interested in investigating those effects. How could
I include a covariate in such model? How should I strucutre it given that
there are species covariates (high order of nesting)?

I hope that my questions are relevant!
Thank you very much in advance for your help!


Nathan

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Appending the Column names

2012-07-28 Thread namit
Hi Freinds,

I have two data frames X,Y. I want  to append both the data frames into one,
along with the columns names from both the data frames (it should look like
Z).

X:
SummaryG  Y R
  Acc   1212   13
  Bcc   1114   15
  Ccc   1315   16

Y:
   SummaryG  Y R
  Acc   1011   12
  Bcc   1312   11
  Ccc   1116   20



Result
--
Z:

Summary G  Y R
  Acc   1212   13
  Bcc   1114   15
  Ccc   1315   16
Summary  G  Y R
  Acc   1011   12
  Bcc   1312   11
  Ccc   1116   20


Can anyone help me on this.

Thanks in Advance.

Thanks,
Namit.



--
View this message in context: 
http://r.789695.n4.nabble.com/Appending-the-Column-names-tp4638207.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] [rjson] toJSON serializes double as int

2012-07-28 Thread Erdal Karaca
I have this object:

 obj - list(test=1.0)
 obj
$test
[1] 1

 toJSON(obj)
[1] {\test\:1}

How do force the serialization of obj$test to be 1.0 instead of 1?
I.e. the output should be:

 toJSON(obj)
[1] {\test\:1.0}

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] low resolution word map

2012-07-28 Thread Thomas Steiner
Hi,
I'd like to have a low resolution word map in R.
The maps package has this option, but if I use the argument, the map
looses sense: Russia and Australia get empty etc

library(maps)
m=map(col=skyblue,fill=TRUE,plot=TRUE,resolution=10)
length(m$x)

If I drop the fill=TRUE, the effect of resolutaion=1 is lost,
ie no change. Is there any other package or could I use the resolution
argument differently?
Thanks,
Thomas

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Beta-Binomial Regression in R

2012-07-28 Thread Anamika Chaudhuri
Hi All:

I am trying to generate Beta-Binomial data with regressors using R. I have
used the following code to generate Beta-Binomial data.  Now I want to add
a covariate to the equation. I would then like to use the simulated data to
run the Beta-Binomial model with covariates on it. Appreciate any help.

set.seed(111)
k-20
n-60
x-NULL

p-rbeta(k,3,3)# so that the mean nausea rate is alpha/(alpha+beta)
for(i in 1:k)
x-cbind(x,rbinom(300,n,p[i]))

Thanks
Anamika

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Coloring Counties in a State Map

2012-07-28 Thread arcata
I am mapping U.S. state counties in individual states maps, e.g. A map of
Montana, with county lines. 
How do I fill specific counties with a different color?



--
View this message in context: 
http://r.789695.n4.nabble.com/Coloring-Counties-in-a-State-Map-tp4638218.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] lm without intercept

2012-07-28 Thread Joshua Wiley
Hi,

R actually uses a different formula for calculating the R square
depending on whether the intercept is in the model or not.

You may also find this discussion helpful:
http://stats.stackexchange.com/questions/7948/when-is-it-ok-to-remove-the-intercept-in-lm/

If you conceptualize R^2 as the squared correlation between the
oberserved and fitted values, it is easy to get:

summary(m0 - lm(mpg ~ 0 + disp, data = mtcars))
summary(m1 - lm(mpg ~ disp, data = mtcars))
cor(mtcars$mpg, fitted(m0))^2
cor(mtcars$mpg, fitted(m1))^2

but that is not how R calculates R^2.

Cheers,

Josh

On Sat, Jul 28, 2012 at 10:40 AM, citynorman citynor...@hotmail.com wrote:
 I've just picked up R (been using Matlab, Eviews etc) and I'm having the same
 issue. Running reg=lm(ticker1~ticker2)  gives R^2=50% while running
 reg=lm(ticker1~0+ticker2) gives R^2=99%!! The charts suggest the fit is
 worse not better and indeed Eviews/Excel/Matlab all say R^2=15% with
 intercept=0. How come R calculates a totally different value?!

 Call:
 lm(formula = ticker1 ~ ticker2)

 Residuals:
  Min   1Q   Median   3Q  Max
 -0.22441 -0.03380  0.01099  0.04891  0.16688

 Coefficients:
 Estimate Std. Error t value Pr(|t|)
 (Intercept)  1.570620.08187   19.18   2e-16 ***
 ticker2  0.617220.02699   22.87   2e-16 ***
 ---
 Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

 Residual standard error: 0.07754 on 530 degrees of freedom
 Multiple R-squared: 0.4967, Adjusted R-squared: 0.4958
 F-statistic: 523.1 on 1 and 530 DF,  p-value:  2.2e-16

 Call:
 lm(formula = ticker1 ~ 0 + ticker2)

 Residuals:
   Min1QMedian3Q   Max
 -0.270785 -0.069280 -0.007945  0.087340  0.268786

 Coefficients:
 Estimate Std. Error t value Pr(|t|)
 ticker2 1.134508   0.001441   787.2   2e-16 ***
 ---
 Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

 Residual standard error: 0.1008 on 531 degrees of freedom
 Multiple R-squared: 0.9991, Adjusted R-squared: 0.9991
 F-statistic: 6.197e+05 on 1 and 531 DF,  p-value:  2.2e-16


 Jan private wrote

 Hi,

 thanks for your help. I'm beginning to understand things better.

 If you plotted your data, you would realize that whether you fit the
 'best' least squares model or one with a zero intercept, the fit is
 not going to be very good
 Do the data cluster tightly around the dashed line?
 No, and that is why I asked the question. The plotted fit doesn't look
 any better with or without intercept, so I was surprised that the
 R-value etc. indicated an excellent regression (which I now understood
 is the wrong interpretation).

 One of the references you googled suggests that intercepts should never
 be omitted. Is this true even if I know that the physical reality behind
 the numbers suggests an intercept of zero?

 Thanks,
   Jan

 __
 R-help@ mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




 --
 View this message in context: 
 http://r.789695.n4.nabble.com/lm-without-intercept-tp3312429p4638204.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



-- 
Joshua Wiley
Ph.D. Student, Health Psychology
Programmer Analyst II, Statistical Consulting Group
University of California, Los Angeles
https://joshuawiley.com/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.