date:20120113

[R] loops over regression models

2012-01-13 Thread Xu Jun

Dear R help listers,

I am trying to replicate results in Gelman and Hill's book (Chapter 3
in regressions and multilevel models). Below I estimated two models
(chp3.1 and chp3.3 in R codes) with the same data and dependent
variable but different independent variables. I have been using Stata
for quite a while, and I know I can use foreach to build a loop to
condense the codes (especially if I have a large number of models to
run).

In Stata, it would be something like:


// read in data
use kidiq, clear

// run two regression
reg kid_score mom_hs
reg kid_score mom_iq

// the next three lines are equivalent of the previous two lines
foreach var in mom_hs mom_iq {
 reg kid_score `var'
}
***


So I want to figure out how to use R to do this. Below are my codes:


library(foreign)
# read in stata data file
kidiq <- data.frame(read.dta('kidiq.dta', convert.factor=FALSE))

# bivariate regressions
chp3.1 <- lm(kid_score ~ mom_hs, data=kidiq)
summary(chp3.1)

chp3.3 <- lm(kid_score ~ mom_iq, data=kidiq)
summary(chp3.3)

clist <- c("mom_iq", "mom_hs")

for (x in clist) {
  lm(kid_score ~ x, data = kidiq)

}
Error in model.frame.default(formula = kid_score ~ x, data = kidiq,
drop.unused.levels = TRUE) :
  variable lengths differ (found for 'x')
##

But I got an error message that says variable length differ. I tried
various ways to work around this, for example, I tried:

clist <- c("mom_iq", "mom_hs")

for (x in 1:length(clist)) {
  lm(kid_score ~ clist[x], data = kidiq)

}



 But none of these work. So I am wondering if anyone could give me
some hint. Thanks a lot

Jun Xu, PhD
Assistant Professor
Department of Sociology
Ball State University
Muncie, IN

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] subset

2012-01-13 Thread 孟欣

Many thanks for your detailed explanation.








At 2012-01-13 11:34:51,R. Michael Weylandt michael.weyla...@gmail.com wrote:
As Jorge noted, the fix is to use %in%: a fuller explanation of why
`==` didn't work is that it implicitly used vector recycling: look at

with(data, id == c("a", "c"))

Implicitly, this expands to id == c("a", "c", "a", "c") to get the
lengths to match. Only the first element matches here, because "c" is
compared against the 2nd and 4th ids rather than the 3rd.

But when you had c("a", "d") it expanded to c("a", "d", "a", "d") and
you get TRUE for the 1st and 4th slot. This, however, was just a lucky
coincidence. Had you used c("d", "a") there would have been no
matches.

Anyway, definitely use %in%, but hopefully this clarifies things.
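
The recycling described above can be checked directly. This is a minimal sketch that rebuilds the `dat` data frame from the quoted question; the names `eq_ac`, `eq_ad`, `eq_da` and `in_ac` are only illustrative.

```r
# Rebuild the example data frame from the original question
dat <- data.frame(id = c("a", "b", "c", "d"),
                  x1 = 1:4,
                  x2 = c(11, 22, 33, 44),
                  x3 = c(111, 222, 333, 444),
                  stringsAsFactors = FALSE)

# `==` recycles the right-hand side to the length of id and compares
# position by position
eq_ac <- dat$id == c("a", "c")   # compared against a, c, a, c
eq_ad <- dat$id == c("a", "d")   # compared against a, d, a, d
eq_da <- dat$id == c("d", "a")   # compared against d, a, d, a

# %in% asks, for each id, whether it occurs anywhere in the set
in_ac <- dat$id %in% c("a", "c")

subset(dat, id %in% c("a", "c"))
```

With `%in%` the order of the values on the right-hand side no longer matters, which is why it is the safer choice for this kind of subsetting.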

Michael

On Thu, Jan 12, 2012 at 9:50 PM, Jorge I Velez jorgeivanve...@gmail.com 
wrote:
 Hi,

 Use %in% instead of ==.

 HTH,
 Jorge.-


 On Thu, Jan 12, 2012 at 9:36 PM, 孟欣 wrote:

 Hi all
 I have a question about subset function.


 > dat
   id x1 x2  x3
 1  a  1 11 111
 2  b  2 22 222
 3  c  3 33 333
 4  d  4 44 444


 > subset(dat,id==c("a","c"))
   id x1 x2  x3
 1  a  1 11 111

 > subset(dat,id==c("a","d"))
   id x1 x2  x3
 1  a  1 11 111
 4  d  4 44 444


 From the above, if I choose id=a,c, the result is wrong, but if I choose
 id=a,d, the result is right.


 What's the reason for it?




 Many thanks!




 My best



Re: [R] subset

2012-01-13 Thread 孟欣

Thanks!

At 2012-01-13 10:51:14, Jorge I Velez jorgeivanve...@gmail.com wrote:
Hi,


Use %in% instead of ==.


HTH,
Jorge.-




[R] analytical solution of partial differential equation

2012-01-13 Thread ATANU

I am trying to solve a partial differential equation (PDE) analytically in R.
I have found some functions that solve PDEs numerically, but that will not
meet my purpose. Is there any function to solve a PDE analytically? Please
help.



Re: [R] loops over regression models

2012-01-13 Thread Petr PIKAL

Hi
 
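 clist <- c("mom_iq", "mom_hs")
 
 for (x in clist) {
   lm(kid_score ~ x, data = kidiq)

use 
as.formula(paste("kid_score ~ ", eval(x)))

As I understand it, the formula takes x literally, so lm picks up the loop
variable itself (a single character string) instead of the column it names.
You need to build the formula from the value of x so that the variable is
found inside your data.

You also need to assign the result of lm inside the loop, or explicitly print
it.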
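
The advice above can be sketched as a complete loop. Since kidiq.dta is not available here, this example simulates a stand-in `kidiq` data frame with the same column names; the simulated values and the list name `fits` are assumptions made only for illustration.

```r
# Simulated stand-in for kidiq.dta (hypothetical values, same column names)
set.seed(1)
kidiq <- data.frame(mom_hs = rbinom(50, 1, 0.8),
                    mom_iq = rnorm(50, mean = 100, sd = 15))
kidiq$kid_score <- 20 + 6 * kidiq$mom_hs + 0.5 * kidiq$mom_iq +
  rnorm(50, sd = 10)

clist <- c("mom_hs", "mom_iq")   # predictor names as character strings
fits <- list()                   # store each fitted model by predictor name

for (x in clist) {
  # Build "kid_score ~ mom_hs", "kid_score ~ mom_iq", ... from the string
  f <- as.formula(paste("kid_score ~", x))
  fits[[x]] <- lm(f, data = kidiq)
}

# Results must be printed explicitly when produced inside a loop
for (x in clist) print(summary(fits[[x]]))
```

`reformulate(x, response = "kid_score")` from the stats package builds the same formula without pasting strings, and `lapply(clist, ...)` would be an equally reasonable alternative to the `for` loop.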

Regards
Petr 

 

[R] beanplot-Error: sample is too sparse to find TD

2012-01-13 Thread Enomis

Hi all,

For two days I have been trying to find a way to create
beanplots for my data. When I call the beanplot function, the following
error appears:

> beanplot(y1 ~ x1, log="", what=c(1,1,1,0), ylim=c(0,1))
Error in bw.SJ(x, method = "dpi") : sample is too sparse to find TD

What is really strange: I have 32 different vectors and the problem occurs
for about 20 of them. All 32 vectors contain the same number of data
values (about 20,000), lie within the same range, and have no NA values.
The only thing that sometimes differs slightly is the distribution.

I also had a look at the source code of the function which causes the
error:

TD <- -TDh(cnt, b, n, d)
if (!is.finite(TD) || TD <= 0)
    stop("sample is too sparse to find TD")

TDh <- function(x, h, n, d) .C(R_band_phi6_bin, as.integer(n),
    as.integer(length(x)), as.double(d), x, as.double(h),
    u = double(1))$u

But this also doesn't really help me. Has anyone of you an idea what's
causing my problem and how I can avoid it? The only thing I found so far
was this thread:
http://r.789695.n4.nabble.com/bandwidth-estimation-using-bw-SJ-td851441.html
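
One possible workaround, sketched here with base R only, is to fall back to a different bandwidth selector whenever `bw.SJ` fails. The helper name `safe_bw` and the simulated vector are hypothetical. Heavily rounded or tied data is one commonly reported trigger for this error, but the post does not confirm that this is the cause for these vectors.

```r
# Hypothetical stand-in data: many values, heavily rounded (lots of ties)
set.seed(42)
x <- round(runif(20000), 2)

# Try the Sheather-Jones selector first; if it stops with an error,
# fall back to the rule-of-thumb selector bw.nrd0()
safe_bw <- function(x) {
  tryCatch(
    bw.SJ(x, method = "dpi"),
    error = function(e) {
      message("bw.SJ failed (", conditionMessage(e), "); using bw.nrd0")
      bw.nrd0(x)
    }
  )
}

bw <- safe_bw(x)
d <- density(x, bw = bw)   # kernel density estimate with the chosen bandwidth
```

If beanplot accepts a bandwidth through its `bw` argument (to be checked against the package documentation), a value obtained this way, or a selector name such as "nrd0", could be passed there instead of the default.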

Greets,
Enomis


[R] Help needed in interpreting linear models

2012-01-13 Thread mails

Dear members of the R-help list,

I have sent the email below to the R-SIG-ME list to ask for help in
interpreting some R output of fitted linear models.

Unfortunately, I haven't yet received any answers. As I am not sure if my
email was sent successfully to the mailing list, I am asking for help here:



Dear members of the R-SIG-ME list,


I am new to linear models and struggling with interpreting some of the R
output but hope to get some advice from here.

I created the following dummy data set:

scores <- c(2,6,10,12,14,20)

weight <- c(60,70,80,75,80,85)

height <- c(180,180,190,180,180,180)

The scores of a game/match should be dependent on the weight of the player
but not on the height. 

For me the output of the following two linear models makes sense:

> (lm1 <- summary(lm(scores ~ weight)))

Call:
lm(formula = scores ~ weight)

Residuals:
       1        2        3        4        5        6 
 1.08333 -1.41667 -3.91667  1.33333  0.08333  2.83333 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)   
(Intercept) -38.0833    10.0394  -3.793  0.01921 * 
weight        0.6500     0.1331   4.885  0.00813 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Residual standard error: 2.661 on 4 degrees of freedom
Multiple R-squared: 0.8564, Adjusted R-squared: 0.8205 
F-statistic: 23.86 on 1 and 4 DF,  p-value: 0.008134 

> (lm2 <- summary(lm(scores ~ height)))

Call:
lm(formula = scores ~ height)

Residuals:
         1          2          3          4          5          6 
-8.800e+00 -4.800e+00  1.377e-14  1.200e+00  3.200e+00  9.200e+00 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  25.2000   139.6175   0.180    0.866
height       -0.0800     0.7684  -0.104    0.922

Residual standard error: 7.014 on 4 degrees of freedom
Multiple R-squared: 0.002703,   Adjusted R-squared: -0.2466 
F-statistic: 0.01084 on 1 and 4 DF,  p-value: 0.9221 

The p-value of the first output is 0.008134, which makes sense as scores and
weight have a high correlation and therefore the scores can be explained by
the explanatory variable/factor weight very well. Hence, the R-squared
value is close to 1. For the second example it also makes sense that the
p-value is almost 1 (p=0.9221) as there is hardly any correlation between
scores and height.

What is not clear to me is shown in my 3rd linear model which includes both
weight and height.

> (lm3 <- summary(lm(scores ~ weight + height)))

Call:
lm(formula = scores ~ weight + height)

Residuals:
         1          2          3          4          5          6 
 1.189e+00 -1.946e+00 -2.165e-15  4.865e-01 -1.081e+00  1.351e+00 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)   
(Intercept) 49.45946   33.50261   1.476  0.23635   
weight       0.71351    0.08716   8.186  0.00381 **
height      -0.50811    0.19096  -2.661  0.07628 . 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Residual standard error: 1.677 on 3 degrees of freedom
Multiple R-squared: 0.9573, Adjusted R-squared: 0.9288 
F-statistic:  33.6 on 2 and 3 DF,  p-value: 0.008833 

It makes sense that the R-squared value is higher when one adds both
explanatory variables/factors to the linear model, as the more variables
are added the more variance is explained and therefore the fit of the model
will be better. However, I do NOT understand why the p-value of height
(Pr(>|t|) = 0.07628) is now almost significant. And also, I do NOT
understand why the overall p-value of 0.008833 is less significant compared
to the one from model lm1, which was p-value: 0.008134.

The p-value of weight being low (p=0.00381) makes sense as this factor
explains the scores very well.



After fitting the 3 models (lm1, lm2 and lm3) I wanted to compare model lm1
with lm3 using the anova function to check whether the factor height
significantly improves the model. In other words, I wanted to check if adding
height to the model helps explain the scores of the players.

The output of the anova looks as follows:

> lm1 <- lm(scores ~ weight)
> lm2 <- lm(scores ~ weight + height)
> anova(lm1,lm2)
Analysis of Variance Table

Model 1: scores ~ weight
Model 2: scores ~ weight + height
  Res.Df     RSS Df Sum of Sq      F  Pr(>F)  
1      4 28.3333
2      3  8.4324  1    19.901 7.0801 0.07628 .
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

In my opinion the p-value should be almost 1 and not close to significance
(0.07), as we have seen from model lm2 that height does not explain the
scores at all. Here, I thought that a significant p-value means that the
factor height adds significant value to the model.


I would be very grateful if anyone could help me in interpreting the R
output.

Best regards

 







[R] Latent class model with Polytomous Variable and Bootstrap

2012-01-13 Thread Kuen Bok lee

Hi.

I ran a latent class analysis with polytomous variables.
Because of the small sample size, I have to use a bootstrap method in order to
select a proper model.
Is there any package or way to run a bootstrap after fitting a
latent class model with polytomous variables?

Thanks

KeunBok Lee
-- 
Keun Bok Lee (이근복)
PhD Student
Department of Sociology
University of California
Berkeley, CA 94720, USA


[R] rho statistics for dinucleotide abundance from a sequence file

2012-01-13 Thread utpal

Hi all,

I have a sequence file (fasta format) and want to calculate the rho
statistic for dinucleotide abundance on my data. The code I
use is (using the seqinr library and the current working directory):

seq_info <- read.fasta("gene.txt")
rho(seq_info[1],2)

but it yields only the dinucleotides, not their rho values, i.e.,

> rho(seq_info[1],2)
aa ac ag at ca cc cg ct ga gc gg gt ta tc tg tt

I would be grateful if anyone could solve this. I've also attached the
sequence file.
Thanks in advance.
Utpal
http://r.789695.n4.nabble.com/file/n4291676/gene.txt gene.txt 


Re: [R] remoting ESS/R with tramp

2012-01-13 Thread Michael Albinus

p...@potis.org writes:

 When I move R and Rscript out of my path, as expected, I also 
 have problems. If I use tramp to connect to remote, open a shell
 there with M-x eshell, and type 'env' , my path is the path on 
 local.

This is the correct behaviour of eshell:

/scp:slbps0:/root $ which env
eshell/env is a compiled Lisp function in `esh-var.el'

If you want the environment of your remote eshell session, you shall
call /bin/env instead.

 Before starting Emacs, try adding /usr/local/R-2.14.0/bin 
 to your local path, e.g.:
  
 export PATH=$PATH:/usr/local/R-2.14.0/bin

Using Tramp's own facilities, you could do

  (add-to-list 'tramp-remote-path "/usr/local/R-2.14.0/bin")

Alternatively, you could instruct Tramp to preserve the path settings of
your remote account:

  (add-to-list 'tramp-remote-path 'tramp-own-remote-path)

Best regards, Michael.


[R] how to create stratified (cross-validation) partitions according to numerical features

2012-01-13 Thread Martin Guetlein

Hi all,

I want to fragment a dataset into k-cross-validation partitions
(folds). The content of the folds should be stratified, but not
according to a single (categorical) feature, but according to a range
of features (numeric, if possible numeric and categorical). Does
anybody know a way to do this?

I only found a way to do this for a single split (training-test split)
with the package sampling. I will paste the example code for the
training-test split below to make clear what I am looking for.

With best regards,
Martin

example code:

library(sampling)
# skipping the iris class column, as this method only works for
# numerical features, but that's ok
data <- as.matrix( iris[1:4] )
prob <- 0.3 # probability to be selected into test set
samplecube(data, pik=rep(prob, times=nrow(data)), order=2)

[...]
QUALITY OF BALANCING
             TOTALS HorvitzThompson_estimators Relative_deviation
Sepal.Length  876.5                   874.6667        -0.20916524
Sepal.Width   458.6                   458.3333        -0.05814799
Petal.Length  563.7                   563.3333        -0.06504642
Petal.Width   179.9                   178.6667        -0.68556606
  [1] 0 1 0 0 1 0 0 0 1 0 1 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 1 0 0 1
 [38] 0 0 1 0 1 1 0 0 0 1 0 1 0 0 1 0 0 1 0 0 0 1 0 0 1 1 0 0 0 1 0 0 0 0 0 0 0
 [75] 0 0 1 1 1 0 0 0 0 0 0 0 1 1 1 0 1 0 1 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 1 0 1
[112] 0 0 0 1 0 0 1 0 1 0 0 0 0 1 0 1 0 1 0 1 1 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 1
[149] 0 0

-- 
Dipl-Inf. Martin Gütlein
Phone:
+49 (0)761 203 7633 (office)
+49 (0)177 623 9499 (mobile)
Email:
guetl...@informatik.uni-freiburg.de


Re: [R] syntax for reading into R

2012-01-13 Thread Marion Wenty

we haven't had time to create the file again, yet, but thanx also for the
great tip about this special mailing list!

marion

2012/1/3 Barry Rowlingson b.rowling...@lancaster.ac.uk

 On Tue, Jan 3, 2012 at 10:54 AM, Marion Wenty marion.we...@gmail.com
 wrote:
  hello barry,
 
  thank you very much for your help!
 
  we managed to do what you said and it basicaly worked - there only is a
  problem with our file, but we will create the file again and then it
 should
  work completely. i will keep you posted how it turned out. thanks again!

  Great - you might get more help from the R-sig-geo mailing list,
 where we tend to talk about spatial data a bit more than on R-help!

 Barry



Re: [R] Help needed in interpreting linear models

2012-01-13 Thread Petr PIKAL

Hi

This looks to me rather like homework, which the policy of this list is 
not to answer.
I am far from being an expert in statistics, so I only express my opinion: 
it seems to me that your height variable behaves like a two-level factor, and 
the single 190 value coincides with a rather suspicious value in weight if I 
look at the plot

plot(scores, weight)

Regards
Petr
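
The point about height can be checked numerically with the data from the question. This is a sketch rather than a full answer; the object names `fit_w` and `fit_wh` are introduced only for illustration. It relies on a standard property of nested linear models: when the larger model adds a single term, the anova F-test and the t-test for that coefficient are the same test (F equals t squared). Both therefore ask whether height explains what is left over after weight, not whether height explains scores on its own.

```r
scores <- c(2, 6, 10, 12, 14, 20)
weight <- c(60, 70, 80, 75, 80, 85)
height <- c(180, 180, 190, 180, 180, 180)

fit_w  <- lm(scores ~ weight)            # weight only
fit_wh <- lm(scores ~ weight + height)   # weight and height

# t-test for height in the larger model
t_h <- coef(summary(fit_wh))["height", "t value"]
p_t <- coef(summary(fit_wh))["height", "Pr(>|t|)"]

# F-test comparing the two nested models
cmp <- anova(fit_w, fit_wh)
F_h <- cmp[2, "F"]
p_F <- cmp[2, "Pr(>F)"]

c(t_squared = t_h^2, F = F_h, p_t = p_t, p_F = p_F)

# Residuals of the weight-only model: player 3, the only one with
# height 190, has the largest residual in absolute value
resid(fit_w)
```

Because only one player has a different height, height acts as an indicator for that single observation, and the two-predictor model fits that player exactly. With six data points, this is enough to produce a p-value near 0.08 without height having any general relationship to scores.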



[R] how to find the number of iterations kmeans used to converge?

2012-01-13 Thread Rui Esteves

Dear all,

I need to know how many iterations kmeans takes to converge each
time I run it.
Any idea how to do it?

Thank you for your attention,
Rui


Re: [R] find inflexion point of discrete value list with R

2012-01-13 Thread Jonas Stein

  d2y <- diff(dy)
  which(dy==0)  ## critical values
  sign(d2y)[which(dy==0)]  ## test for max/min/saddle
  which(d2y==0)   ## inflection points
 
 I would think that testing for d2y==0 would be akin to the error in
 numeric analysis warned about in FAQ 7.31. Seems unlikely that in real
 data that there would always be three points in a row with equal
 differences at a true inflection and even then, many of the ones you
 did find satisfying that criterion would not be in fact inflection
 points. Wouldn't it be better to fit a spline and then do your testing
 on the spline approximation?
 
 Counter-example:
  x=1:10
 y=c(1,2,3,5,7,10,13,16,20,24)
  dy <- diff(y)
  d2y <- diff(dy)
 which(d2y==0)
 [1] 1 3 5 6 8
 
 And actually the original data was a pretty good counter-example as well.


   The original post wasn't entirely clear, but I thought the data were
 indeed integers and that the discrete-state version of
 min/max/inflection point was indeed what was wanted.  Yes, if the
 underlying variable is continuous you might want to use splinefun(),
 with its deriv= argument, and uniroot(), to find maxima and minima.
 Might be a little tricky in general, although with an interpolation
 spline between a finite set of points you can at least deal with it
 exhaustively.

my real data is not limited to integer. Do you know a ready to use code
example for this?
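
A minimal sketch of the spline approach suggested above, assuming the data can reasonably be interpolated by a cubic spline. The helper name `find_inflections` and its `n_grid` parameter are hypothetical. The function evaluates the second derivative of the spline on a fine grid, finds intervals where it changes sign, and refines each crossing with `uniroot()`.

```r
# Hypothetical helper: inflection points of an interpolating spline
find_inflections <- function(x, y, n_grid = 1000) {
  f  <- splinefun(x, y)                  # cubic interpolation spline
  d2 <- function(t) f(t, deriv = 2)      # its second derivative

  grid <- seq(min(x), max(x), length.out = n_grid)
  s <- sign(d2(grid))

  # Indices i where the second derivative changes sign on [grid[i], grid[i+1]]
  idx <- which(s[-1] * s[-length(s)] < 0)

  # Refine each bracketed sign change to a root of the second derivative
  vapply(idx,
         function(i) uniroot(d2, c(grid[i], grid[i + 1]))$root,
         numeric(1))
}

# Example: y = x^3 has a single inflection point at x = 0
x <- seq(-2, 2, by = 0.25)
y <- x^3
infl <- find_inflections(x, y)
infl
```

For noisy measurements an interpolating spline follows the noise, so the second derivative may change sign many times; a smoothing spline such as `smooth.spline()` is probably a better starting point in that case.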

Would it be a good idea to create a function and make it public to the
community? And if yes as single .R file, or as a library?

kind regards,

-- 
Jonas Stein n...@jonasstein.de


[R] Remove space from string

2012-01-13 Thread Vikram Bahure

Dear R users,

I have a trivial query.

I have a string and I want to remove the spaces from it.

For example:

Input:
a <- " Remove space "

Output required:
Removespace

I tried using str_trim (from library(stringr)), but it only removes the
spaces at the ends.

Regards
Vikram


Re: [R] Remove space from string

2012-01-13 Thread Gustaf Rydevik

gsub(" ", "", a)

/Gustaf
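
A short runnable illustration of the `gsub` approach, using the string from the question. The second call, which also strips tabs and newlines, goes beyond what was asked, and the variable names are only illustrative.

```r
a <- " Remove space "

# Replace every literal space with nothing; fixed = TRUE skips regex parsing
no_space <- gsub(" ", "", a, fixed = TRUE)

# Variant: remove any whitespace (spaces, tabs, newlines) via a POSIX class
no_ws <- gsub("[[:space:]]+", "", " Remove\tspace \n")

no_space
no_ws
```

Unlike `str_trim`, which only strips leading and trailing whitespace, `gsub` replaces every match anywhere in the string.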





-- 
Gustaf Rydevik, M.Sci.
tel: +44(0)74 253 760 42
address:St John's hill 18/5  EH8 9UQ Edinburgh, UK
skype:gustaf_rydevik


Re: [R] remoting ESS/R with tramp

2012-01-13 Thread Claudia Beleites

Tom,

what happens with:

(Emacs)
M-x ssh
t
(you should have the remote shell buffer now)
R
(once R is started)
M-x ess-remote
r

?



Claudia


-- 
Claudia Beleites
Spectroscopy/Imaging
Institute of Photonic Technology
Albert-Einstein-Str. 9
07745 Jena
Germany

email: claudia.belei...@ipht-jena.de
phone: +49 3641 206-133
fax:   +49 2641 206-399


[R] Odp: Remove space from string

2012-01-13 Thread Petr PIKAL

Hi

 
 Dear R users,
 
 I have some trivial query.
 
 I have a string, I want to remove space from the string.
 
 For eg.
 
 Input:
 a <- " Remove space "
 
 Output required:
 Removespace

It seems to be simple. Even I, with my very poor knowledge of regular 
expressions, can suggest a solution.

gsub(" ", "", a)

Regards
Petr


 

Re: [R] how to find the number of iterations kmeans used to converge?

2012-01-13 Thread David Winsemius



On Jan 13, 2012, at 5:53 AM, Rui Esteves wrote:


Dear all,

I need to know in which number of iterations the kmeans converge each
time I run it.
Any idea how to do it?



Look at the help page (to see that it is not part of the returned
object) and then look at the code (to see that the object returned
from the .C() call is immediately checked to see if the number of
iterations exceeded the maximum set by the user). Search for this code:


if (Z$iter > iter.max)

Then you should be able to see your way forward. You can either create
a modified return object or you can insert a line that prints that
value.
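
For readers on more recent R releases: the object returned by `kmeans()` there does include an `iter` component holding the number of (outer) iterations, so modifying the function is no longer needed. The sketch below assumes such a release; the simulated data are hypothetical.

```r
# Two well-separated simulated clusters (hypothetical data)
set.seed(123)
x <- rbind(matrix(rnorm(100, mean = 0), ncol = 2),
           matrix(rnorm(100, mean = 4), ncol = 2))

km <- kmeans(x, centers = 2, iter.max = 25)

# Number of iterations used in this run (available in recent R versions)
km$iter
```

On an installation where `km$iter` is NULL, the approach described above (copying the function and returning or printing `Z$iter`) still applies.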


--
David Winsemius, MD
West Hartford, CT


[R] second order cone programmming, linear inequalities constraints

2012-01-13 Thread riccardo giacomelli

Hi all,
I can see two packages implementing second order cone programming in R:
CLSOCP and DWD. Both of them allow only linear equality constraints to be
specified.
Do you know if there is a library to solve this problem:
second order cone programming with linear inequality constraints,

excluding mosek and cplex?
Thanks


[R] Averaging within a range of values

2012-01-13 Thread doggysaywhat

Hello all.

I have two data frames.  
Group   Start   End
G1      200     700
G2      500     1000
G3      2000    3000
G4      4000    6000
G5      7000    8000


and

Pos     C0      C1
200     0.9     0.6
500     0.8     0.8
800     0.9     0.7
1000    0.7     0.6
2000    0.6     0.4
2500    1.2     0.8
3000    0.6     1.5
3500    0.7     0.7
4000    0.8     0.8
4500    0.6     0.6
5000    0.9     0.9
5500    0.7     0.8
6000    0.8     0.7
6500    0.4     0.4
7000    0.5     0.8
7500    0.7     0.9
8000    0.9     0.5
8500    0.8     0.6
9000    0.9     0.8


I need to conditionally average all values in columns C0 and C1 based upon
the bins I defined in the first data frame.  For example, for the bin G1 in
the first data frame, the range is 200 to 700, so I would average the values
at Pos 200 (0.9) and 500 (0.8) for C0 and then do the same thing for
C1.

I can do this in Excel with array formulas, but I'm relatively new to R and
would like to know if there is a function that will perform the same action.  I
don't know if this will help, but the Excel array formula I used was
average(if((range>=start)*(range<=end),range)), where range is the
entire Pos column.

Initially I looked at the aggregate function.  I can use aggregate when I
give a single vector to be used for grouping, such as (A,B,C), but I'm not
sure how to define the grouping as the bin 200-500, the second bin as
500-1000, etc., and use that as my grouping vector.

Any help would be greatly appreciated.  


--
View this message in context: 
http://r.789695.n4.nabble.com/Averaging-within-a-range-of-values-tp4291958p4291958.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] simulating stable VAR process

2012-01-13 Thread statquant2

Hello Paul,
Thanks for the answer, but my point is not how to simulate a VAR(p) process
and check that it is stable.
My question is rather how I can generate a VAR(p) process that I already know
is stable.

We know a condition that ensures stability (see first message), but
this is not a condition on the coefficients, etc.
What I want is to
generate, say, 1000 random VAR(3) processes over, say, 500 time periods that
will be STABLE (meaning that if I run stability(), all will pass the test).

When I try to do that, it seems that none of the VARs I am generating pass
this test, so I assume that the class of stable VAR(p) processes is very small
compared to the whole class of VAR(p) processes.
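
One possible way to get coefficients that are stable by construction, sketched under the assumption that "stable" means all eigenvalues of the companion matrix lie strictly inside the unit circle (this may differ from what stability() checks). Replacing A_i by s^i * A_i multiplies every companion eigenvalue by s, so random coefficients can be rescaled to any target modulus. The function names, K = 2 and target = 0.9 are hypothetical choices for illustration.

```r
# Companion matrix of a VAR(p) with K x K coefficient matrices A[[1]], ..., A[[p]]
companion <- function(A) {
  K <- nrow(A[[1]])
  p <- length(A)
  top <- do.call(cbind, A)
  if (p == 1) return(top)
  bottom <- cbind(diag(K * (p - 1)), matrix(0, K * (p - 1), K))
  rbind(top, bottom)
}

# Largest eigenvalue modulus of the companion matrix
max_modulus <- function(A) {
  max(Mod(eigen(companion(A), only.values = TRUE)$values))
}

# Draw random coefficients, then rescale A_i by s^i so that the largest
# modulus equals `target` (< 1)
random_stable_var <- function(K, p, target = 0.9) {
  A <- replicate(p, matrix(rnorm(K * K), K, K), simplify = FALSE)
  s <- target / max_modulus(A)
  lapply(seq_len(p), function(i) A[[i]] * s^i)
}

set.seed(42)
coefs <- replicate(1000, random_stable_var(K = 2, p = 3), simplify = FALSE)
moduli <- vapply(coefs, max_modulus, numeric(1))
print(range(moduli))
```

Each element of coefs is a list of three coefficient matrices that could then be fed to a simulation over 500 periods; target could also be drawn at random from (0, 1) if a spread of persistence levels is wanted.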



--
View this message in context: 
http://r.789695.n4.nabble.com/simulating-stable-VAR-process-tp4261177p4291835.html
Sent from the R help mailing list archive at Nabble.com.


[R] GLHT in multcomp: Two similar models, one doesn't work

2012-01-13 Thread gaiarrido

I ran this model:

> model2<-glm(rojos~ageandsex+sector+season+sector:season,quasipoisson)

> glht(model2,linfct=mcp(ageandsex="Tukey"))

 General Linear Hypotheses

Multiple Comparisons of Means: Tukey Contrasts


Linear Hypotheses:
 Estimate
M - H == 0 0.2898
SUB - H == 0  -0.2261
SUB - M == 0  -0.5159


I tried to do the same, replacing the factor season (with 2 levels) with month
(with four levels), and I get an error and no results:

> monthmodel2<-glm(rojos~ageandsex+sector+month+sector:month,quasipoisson)

> glht(monthmodel2,linfct=mcp(ageandsex="Tukey"))
Error in modelparm.default(model, ...) : 
  dimensions of coefficients and covariance matrix don't match


I do not understand why this happens.

I would be grateful for your help.


-
Mario Garrido Escudero
PhD student
Dpto. de Biología Animal, Ecología, Parasitología, Edafología y Qca. Agrícola
Universidad de Salamanca
--
View this message in context: 
http://r.789695.n4.nabble.com/GLHT-in-multcomp-Two-similar-models-one-doesn-t-work-tp4291875p4291875.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] colored outliers

2012-01-13 Thread Geophagus

Hi Justin, 
it still does not work.
All points become red.

I use this script with your modifications:

TOC_NI<-read.csv2("C:/Users/hilliges/Desktop/Master/Daten/Statistik/TOC-NI.csv",
sep=";", dec=",", encoding="UTF-8")
circ<-TOC_NI[order(TOC_NI$NI,decreasing=T),][1:4,] 
plot(NI~TOC,data=TOC_NI,col="blue", pch=16, xlim=c(0,450))
abline(lm(NI~TOC,data=TOC_NI),col = "red",lwd=3)
points(NI~TOC,data=TOC_NI,col='red',pch=1,size=3) 

Maybe the source file will help to solve the problem?
http://r.789695.n4.nabble.com/file/n4291954/TOC-NI.csv TOC-NI.csv 

Thank you so much!
GeO






--
View this message in context: 
http://r.789695.n4.nabble.com/colored-outliers-tp4282207p4291954.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Help needed in interpreting linear models

2012-01-13 Thread mails

Hi Petr,

thanks for your answer.

First of all, it's not homework. I am a student and need to analyse cancer
data using linear models.
I have been looking into this topic for a week now and am still struggling to
interpret some of the R output, which is why
I was asking for help here.

I don't quite understand your answer because the 180/190 values belong to
height and not to weight. What do you want to 
show with plot(scores,weight)? What I can see from the plot is that there is
a correlation between the two variables and 
therefore weight explains scores.

Regards

--
View this message in context: 
http://r.789695.n4.nabble.com/Help-needed-in-interpreting-linear-models-tp4291670p4291894.html
Sent from the R help mailing list archive at Nabble.com.


[R] apply transformation

2012-01-13 Thread santosh

Hello All,

I have the following dataset:

Year   2006   2007
Jan  Jan 0.0204 0.0065
Feb  Feb 0.0145 0.0082
Mar  Mar 0.0027 0.0122


> dput(d_tmp)
structure(list(Year = c("Jan", "Feb", "Mar"), `2006` = c(0.0204,
0.0145, 0.0027), `2007` = c(0.0065, 0.0082, 0.0122)), .Names =
c("Year",
"2006", "2007"), row.names = c("Jan", "Feb", "Mar"), class =
"data.frame")


I am trying to use the apply function, but the values seem to be
getting coerced to character. I could recast them in my function, but I
suspect there should be an easier way.

I can always use a for loop to get the output I need, but I am just
wondering if there is a way to get the same result using apply or some other
function (the number of years can change in my requirement).

My final output needs to be as follows:

Year    2006    2006-Lbl  2007    2007-Lbl
Jan     0.0204  '2.04%'   0.0065  '0.65%'
Feb     0.0145  '1.45%'   0.0082  '0.82%'
Mar     0.0027  '0.27%'   0.0122  '1.22%'

i.e.
> dput(d_final)
structure(list(Year = structure(c(2L, 1L, 3L), .Label = c("Feb",
"Jan", "Mar"), class = "factor"), X2006 = c(0.0204, 0.0145, 0.0027
), X2006.Lbl = structure(c(3L, 2L, 1L), .Label = c("'0.27%'",
"'1.45%'", "'2.04%'"), class = "factor"), X2007 = c(0.0065, 0.0082,
0.0122), X2007.Lbl = structure(1:3, .Label = c("'0.65%'", "'0.82%'",
"'1.22%'"), class = "factor")), .Names = c("Year", "X2006",
"X2006.Lbl",
"X2007", "X2007.Lbl"), row.names = c(NA, -3L), class = "data.frame")

Please advise.
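
A possible sketch of one way to do this, not a definitive answer. apply() first converts the data frame to a matrix, and because Year is character the whole matrix becomes character, which is the coercion described above. Working column by column avoids that. The single quotes inside the labels follow the desired output shown earlier; the names year_cols and lbl_cols are made up for illustration.

```r
d_tmp <- data.frame(Year = c("Jan", "Feb", "Mar"),
                    `2006` = c(0.0204, 0.0145, 0.0027),
                    `2007` = c(0.0065, 0.0082, 0.0122),
                    check.names = FALSE, stringsAsFactors = FALSE)

# Every column except Year is treated as a year, however many there are
year_cols <- setdiff(names(d_tmp), "Year")
lbl_cols  <- paste0(year_cols, "-Lbl")

# Add one label column per year; the numeric columns stay numeric
for (i in seq_along(year_cols)) {
  d_tmp[[lbl_cols[i]]] <- sprintf("'%.2f%%'", 100 * d_tmp[[year_cols[i]]])
}

# Interleave value and label columns: 2006, 2006-Lbl, 2007, 2007-Lbl, ...
d_final <- d_tmp[c("Year", as.vector(rbind(year_cols, lbl_cols)))]
print(d_final)
```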

Santosh


[R] read.table as integer

2012-01-13 Thread Francisco


Hello,
I have a csv file with many variables, both character and integer.
I would like to load it into R and do some operations on the integer 
variables. The problem is that R loads the entire dataset treating 
all variables as character; instead I would like R to make the 
distinction between the two types, because there are too many variables 
to do:

x1<-as.integer(x1)
x2<-as.integer(x2)
x3<-as.integer(x3)
...

I tried to specify read.table(... stringsAsFactors=FALSE) but it doesn't 
work.


Thanks,
Best Regards


[R] Problems with plotCI

2012-01-13 Thread Lasse DSR-mail

I have a problem with plotCI (plotrix).

I only want to plot the upper part of the error bar in my barplot.

I had the exact same commands working two months ago.

Now I wanted to change the legend, and when I ran it again plotCI stopped
working.

To me it seems like there must be some bug in R.

library(plotrix)
library(graphics)

x = matrix(c(13.75516276, 3.944604404, 15.02280537,
             80.27687015, 91.35282561, 81.8232087,
             5.96796709,  3.625017815, 3.202591556), nrow=3, byrow=T)

yx = matrix(c(2.5,7.5,12.5,3.5,8.5,13.5,4.5,9.5,14.5), nrow=3, byrow=T)

ste <- c(4.870993623, 1.139221564, 2.70870722, 5.789702998, 2.770116512,
         5.4600946821, 1.2926938771, 1.2881562951, 1.996090108)

w <- c("grey", "light grey", "white")

r <- c("Leioproctus", "Lasioglossum", "Bombus")

barplot(x, ylab="Relative abundance (%, +SE)",
        xlab="Land use",
        col=c("grey","light grey","white"), beside=TRUE,
        space=c(0,2),
        ylim=c(0,107), cex.lab=1.3)

legend(x=3.5, y=108, box.lty=0, legend=r, fill=w)

box(bty='l')

# original command, now not working
plotCI(x=yx, y=x, uiw=ste, liw=NA, pch=NA_integer_, col="black",
       lwd=1, add=TRUE)

## why is this following command working
plotCI(x=yx, y=x, uiw=NA, liw=ste, pch=NA_integer_, col="black",
       lwd=1, add=TRUE)
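
Not an explanation of the plotCI behaviour, but a possible workaround sketched with base graphics only: arrows() can draw just the upper half of each error bar. This swaps plotCI for a different technique. It assumes that ste is ordered like as.vector(x), that is, column by column; that ordering is my assumption and is not stated in the message. barplot() returns the bar midpoints, so yx does not have to be typed in by hand.

```r
x <- matrix(c(13.75516276, 3.944604404, 15.02280537,
              80.27687015, 91.35282561, 81.8232087,
              5.96796709,  3.625017815, 3.202591556), nrow = 3, byrow = TRUE)
ste <- c(4.870993623, 1.139221564, 2.70870722, 5.789702998, 2.770116512,
         5.4600946821, 1.2926938771, 1.2881562951, 1.996090108)

# barplot() returns the x positions of the bar midpoints as a 3 x 3 matrix
mids <- barplot(x, beside = TRUE, space = c(0, 2), ylim = c(0, 107),
                col = c("grey", "light grey", "white"))

# Assumption: ste[i] belongs to as.vector(x)[i]
tops <- as.vector(x) + ste

# Upper error bars only: from the bar top up to bar top + SE, flat cap on top
arrows(as.vector(mids), as.vector(x), as.vector(mids), tops,
       angle = 90, length = 0.05, lwd = 1)
```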

 

 

Cheers

Lasse Bech Jacobsen

Horticultural Master Student 

 

Copenhagen University

Faculty of Life Sciences

Department of Ecology and Zoology

 

+4526829470

 



Re: [R] colored outliers

2012-01-13 Thread Petr PIKAL

Hi

what do you want to achieve?

 
> Hi Justin, 
> it still does not work.
> All points become red.
> 
> I use this skript with your modifications:
> 
> TOC_NI<-read.csv2("C:/Users/hilliges/Desktop/Master/Daten/Statistik/TOC-NI.csv",
> sep=";", dec=",", encoding="UTF-8")
> circ<-TOC_NI[order(TOC_NI$NI,decreasing=T),][1:4,] 
> plot(NI~TOC,data=TOC_NI,col="blue", pch=16, xlim=c(0,450))

The points are plotted as small blue circles. You can make the points 
bigger with, let's say, cex=2.

> abline(lm(NI~TOC,data=TOC_NI),col = "red",lwd=3)

A red line is plotted according to the linear model.

> points(NI~TOC,data=TOC_NI,col='red',pch=1,size=3) 

All points are plotted again as red circles (possibly overplotting the 
former ones). There is no size parameter for the plot command, therefore it is 
omitted.

See ?plot.default and ?par for available options.

You may want cex=3 to get those new points as bigger circles.
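
A minimal sketch of what might be intended, assuming the goal is to circle only the four largest NI values that the script stores in circ. The data below are simulated stand-ins for TOC-NI.csv, which is not reproduced here.

```r
set.seed(1)

# Made-up stand-in for the TOC-NI.csv data
TOC_NI <- data.frame(TOC = runif(30, 0, 450))
TOC_NI$NI <- 0.1 * TOC_NI$TOC + rnorm(30)

# The four rows with the largest NI values
circ <- TOC_NI[order(TOC_NI$NI, decreasing = TRUE), ][1:4, ]

plot(NI ~ TOC, data = TOC_NI, col = "blue", pch = 16, xlim = c(0, 450))
abline(lm(NI ~ TOC, data = TOC_NI), col = "red", lwd = 3)

# Overplot only the subset (data = circ, not TOC_NI) with larger open circles
points(NI ~ TOC, data = circ, col = "red", pch = 1, cex = 3)
```

The only changes relative to the quoted script are data = circ in the points() call and cex in place of size.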

Regards
Petr


 
> Maybe the the Sourcefile will help to solve the prob?
> http://r.789695.n4.nabble.com/file/n4291954/TOC-NI.csv TOC-NI.csv 
> 
> Thank you so muich!
> GeO
 
 
 
 
 
 

Re: [R] Help needed in interpreting linear models

2012-01-13 Thread Petr PIKAL

I forgot to post it to r help.
Petr

Hi

 
> Hi Petr,
> 
> thanks for your answer.
> 
> First of all it's not homework I am a student and need to analyse cancer
> data using linear models.
> I looked into that topic since a week now and still struggling in
> interpreting some of the R output that is why
> I was asking for help here.
> 
> I don't quite understand your answer because the 180/190 values belong to
> height and not to weight. What do you want to 
> show with plot(scores,weight). What I can see from the plot is that there is
> a correlation between the two variables and 
> therefore weight explains scores.

Yes, but as far as I remember (I do not keep mails so now I can not see 
the data you posted - Nabble is not available for me) I said that the 
height value 190 (which was unique, all others were 180 if I remember 
correctly) is pointing to scores/weight pair which is slightly out from 
the simple linear model

lm(scores~weight)

so it is kind of an outlier from the model. Therefore adding the variable 
(height) to the model improves it and therefore the height variable in the 
second model is slightly significant as you found from anova.

You can also inspect your models by

plot(predict(fit), y.variable)
abline(0,1)

The better the model, the closer the points are to the 0,1 line. Of 
course you can use some more formal evaluation (residuals, hatvalues...), 
and you can find appropriate literature e.g. on the CRAN web site. These two are my 
favourites, although there are plenty of other sources.
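
A small sketch of that check on the built-in cars data. The data set and the names fit, res and hat are stand-ins of mine, since the original scores/weight/height data are not available here.

```r
# Simple linear model on a built-in data set
fit <- lm(dist ~ speed, data = cars)

# Predicted versus observed values; points near the 0,1 line indicate a good fit
plot(predict(fit), cars$dist, xlab = "Predicted", ylab = "Observed")
abline(0, 1)

# The more formal diagnostics mentioned above
res <- residuals(fit)
hat <- hatvalues(fit)
print(summary(res))
```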

“Using R for Data Analysis and Graphics - Introduction, Examples and 
Commentary” by John Maindonald (PDF, data sets and scripts are available 
at JM's homepage). 
“Practical Regression and Anova using R” by Julian Faraway (PDF, data sets 
and scripts are available at the book homepage). 

Regards
Petr

 
 Regards
 

Re: [R] analytical solution of partial differential equation

2012-01-13 Thread Ben Bolker

ATANU <ata.sonu at gmail.com> writes:

> 
> i am trying to solve a partial differential equation analytically(PDE) in R .
> i have found some functions that do the stuff numerically. But that will not
> meet my purpose. is there any function to solve PDE  analytically. please
> help. 

  With extremely limited exceptions (D(), deriv()), R doesn't do
analytical solutions.  There is an interface to Yacas.  You might
try Mathematica, or Sage (search if you are looking for a free solution).
Good luck, though: solving anything but the simplest PDEs analytically
is very challenging ...
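
To illustrate how limited the exceptions are, here is a small sketch of D(): it differentiates an expression symbolically but does not solve equations. The expression is an arbitrary example.

```r
# Symbolic derivative of x^2 * sin(x) with respect to x
f  <- expression(x^2 * sin(x))
df <- D(f, "x")
print(df)

# The result is an ordinary R call that can be evaluated numerically
value_at_1 <- eval(df, list(x = 1))
print(value_at_1)
```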


Re: [R] strange Sys.Date() side effect

2012-01-13 Thread Czerminski, Ryszard

Basically, I want to create a string vector whose first element is a
date.

> Note that if you supply c() with objects of different types (as you
> have),
> the results will probably not be what you wanted.

This is the key. I assumed (incorrectly)
that Sys.Date() generates a character string:

> Sys.Date()
[1] "2012-01-13"

but as you point out:

> class(Sys.Date())
[1] "Date"

and it is best to provide c() with arguments of the same type,
so, in the end, explicitly casting Sys.Date() to character works for
me:

> N <- 2; c(as.character(Sys.Date()), sprintf('N=%d', N))
[1] "2012-01-13" "N=2"

Thanks!

Ryszard


 
-Original Message-
From: MacQueen, Don [mailto:macque...@llnl.gov] 
Sent: Thursday, January 12, 2012 5:27 PM
To: Czerminski, Ryszard; r-help@r-project.org
Subject: Re: [R] strange Sys.Date() side effect

My best guess is that you are misunderstanding what the c() function
does.
I'd suggest reading the help page for c, obtained by typing
  ?c
Note that if you supply c() with objects of different types (as you
have),
the results will probably not be what you wanted.

Given what c() does, your output and warning message make sense. But I'm
unable to figure out what you're really trying to do. Something like
this,
perhaps?

> N <- 2; sprintf('N = %d', N)
[1] "N = 2"


-Don


-- 
Don MacQueen

Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062





On 1/10/12 5:04 AM, Czerminski, Ryszard
<ryszard.czermin...@astrazeneca.com> wrote:

> Any ideas what is the problem with this code?
>
> > N <- 2; c(Sys.Date(), sprintf('N = %d', N))
> [1] "2012-01-10" NA
> Warning message:
> In as.POSIXlt.Date(x) : NAs introduced by coercion

Best regards,
Ryszard

Ryszard Czerminski
AstraZeneca Pharmaceuticals LP
35 Gatehouse Drive
Waltham, MA 02451
USA
781-839-4304
ryszard.czermin...@astrazeneca.com



Re: [R] read.table as integer

2012-01-13 Thread David Winsemius



On Jan 13, 2012, at 7:02 AM, Francisco wrote:


> Hello,
> I have a csv file with many variables, both characters and integers.
> I would like to load it on R and do some operations on integer
> variables, the problem is that R loads the entire dataset
> considering all variables as characters, instead I would like that R
> makes the distinction between the two types, because there are too
> many variables to do:
>
> x1<-as.integer(x1)
> x2<-as.integer(x2)
> x3<-as.integer(x3)
> ...
>
> I tried to specify read.table(... stringsAsFactors=FALSE) but it
> doesn't work.


You need to use colClasses
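
A minimal sketch of what that could look like. The inline CSV text and the column names id, name and count are invented for illustration; with a real file, the file name would replace the text argument.

```r
# Made-up CSV content standing in for the real file
csv_text <- "id,name,count
1,alpha,10
2,beta,20"

# Declare the class of each column by name instead of converting afterwards
dat <- read.csv(text = csv_text,
                colClasses = c(id = "integer",
                               name = "character",
                               count = "integer"))
str(dat)
```

Note that reading stops with an error if a column declared as integer contains non-numeric text, which is also a quick way to find out which columns hold unexpected values.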

--
David Winsemius, MD
West Hartford, CT


Re: [R] read.table as integer

2012-01-13 Thread Gabor Grothendieck

On Fri, Jan 13, 2012 at 7:02 AM, Francisco <franciscororol...@google.com> wrote:
> Hello,
> I have a csv file with many variables, both characters and integers.
> I would like to load it on R and do some operations on integer variables,
> the problem is that R loads the entire dataset considering all variables as
> characters, instead I would like that R makes the distinction between the
> two types, because there are too many variables to do:
> x1<-as.integer(x1)
> x2<-as.integer(x2)
> x3<-as.integer(x3)
> ...
>
> I tried to specify read.table(... stringsAsFactors=FALSE) but it doesn't
> work.

There must be non-integers in some of the columns that are supposed to
be integer.  Let's assume that the first row has no such garbage.  Then
we can get the desired classes from that row and apply them to the
entire data frame.  In this example the second column has such
garbage:

# test data
Lines <- "a,b,c
D,2,3
a,b,9
C,5,6"

# read in just row 1 and read in all rows
DF1 <- read.csv(text = Lines, nrow = 1, as.is = TRUE)
DF <- DF0 <- read.csv(text = Lines, as.is = TRUE)

# there will be a warning as it is converting garbage to NAs
to.int <- function(v, v1) if (inherits(v1, "integer")) as.integer(v) else v
DF <- mapply(to.int, DF0, DF1, SIMPLIFY = FALSE)
DF <- as.data.frame(DF)

As we see here, the second column becomes integer despite the garbage in it:

> str(DF0) # as read in
'data.frame':   3 obs. of  3 variables:
 $ a: chr  "D" "a" "C"
 $ b: chr  "2" "b" "5"
 $ c: int  3 9 6
> str(DF) # as converted
'data.frame':   3 obs. of  3 variables:
 $ a: Factor w/ 3 levels "a","C","D": 3 1 2
 $ b: int  2 NA 5
 $ c: int  3 9 6
-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com


[R] IF ELSE

2012-01-13 Thread jost87

Can somebody explain the problem in the following expression?

Thank you


> if (species == 1){ 
+   fitness <-
(1-b)*exp(-((microsites-niche.preference)/(niche.width.specialist+a.specialist)^2)*(1-a.specialist)
+ }else 
Error: unexpected '}' in:
"  fitness <-
(1-b)*exp(-((microsites-niche.preference)/(niche.width.specialist+a.specialist)^2)*(1-a.specialist)
}"
> if (species ==2){
+ fitness )
(1-b)*exp(-((microsites-niche.preference)/(niche.width.generalist+a.generalist)^2)*(1-a.generalist)
Error: unexpected ')' in:
"if (species ==2){
fitness )"
> }   else fitness <- 0 
Error: unexpected '}' in "}"
> }
Error: unexpected '}' in "}"

--
View this message in context: 
http://r.789695.n4.nabble.com/IF-ELSE-tp4292285p4292285.html
Sent from the R help mailing list archive at Nabble.com.


[R] function to replace values doesn't work on vectors

2012-01-13 Thread WoutH

I've got a numeric vector with values ranging from 1 to 5, and I would like to
categorize these values like this:

1 becomes category 1
3 becomes category 3
And everything else goes in category 2. The simple function I wrote beneath works
for single numeric values, but for some reason I am unable to feed it vectors.
Any help would be appreciated, as I'm fairly new to R.



--
View this message in context: 
http://r.789695.n4.nabble.com/function-to-replace-values-doesn-t-work-on-vectors-tp4292235p4292235.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] IF ELSE

2012-01-13 Thread Joshua Wiley

Hi jost87,

Try using a good text editor with syntax highlighting and
parenthesis/brace matching (personal preference is Emacs + ESS, others
like Rstudio, and there are tons more).  I find at least one
mismatched pair of parentheses/braces in those expressions.
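
To make that concrete: in each fitness expression the call to exp( is opened but never closed. A hedged sketch of a balanced version follows. Where the missing ")" belongs is my assumption (placed right after the ^2 term, which is the smallest change), and if the whole ratio was meant to be squared the parenthesis would go before ^2 instead. The wrapper function fitness_for is hypothetical and only there to make the example self-contained.

```r
fitness_for <- function(species, microsites, niche.preference,
                        niche.width.specialist, a.specialist,
                        niche.width.generalist, a.generalist, b) {
  if (species == 1) {
    # One ")" added after the ^2 term to close exp(
    (1 - b) * exp(-((microsites - niche.preference) /
                      (niche.width.specialist + a.specialist)^2)) *
      (1 - a.specialist)
  } else if (species == 2) {
    (1 - b) * exp(-((microsites - niche.preference) /
                      (niche.width.generalist + a.generalist)^2)) *
      (1 - a.generalist)
  } else {
    0
  }
}

# Example call with made-up parameter values
print(fitness_for(1, microsites = 5, niche.preference = 5,
                  niche.width.specialist = 2, a.specialist = 0.1,
                  niche.width.generalist = 4, a.generalist = 0.2, b = 0.5))
```

Keeping "} else" on the same line as the closing brace also matters when the code is typed at the top level of the console, because a complete if statement is evaluated as soon as its closing brace is read.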

Cheers,

Josh



-- 
Joshua Wiley
Ph.D. Student, Health Psychology
Programmer Analyst II, Statistical Consulting Group
University of California, Los Angeles
https://joshuawiley.com/


Re: [R] How can I prevent solve.QP from printing the solution progress ?

2012-01-13 Thread Uzuner, Tolga I

Many thanks all,
Tolga

-Original Message-
From: Chibisi Chima-Okereke [mailto:cchima-oker...@mango-solutions.com] 
Sent: 12 January 2012 17:14
To: Uzuner, Tolga I
Cc: Jack Teall; Richard Pugh
Subject: RE: How can I prevent solve.QP from printing the solution progress ?

Try something like this ...

myFile <- tempfile()

sink(file = myFile)
solve.QP()  # your solve.QP call, with its arguments, goes here
sink()
file.remove(myFile) #or
unlink(myFile)

Kind regards

Chibisi

-Original Message-
From: Jack Teall 
Sent: 12 January 2012 16:49
To: Matt Aldridge; Richard Pugh; Consultants
Subject: FW: How can I prevent solve.QP from printing the solution progress ?

tolga.i.uzu...@jpmorgan.com

Jack Teall

T:+44 (0)1249 766811
F: +44 (0)1249 767707
www.mango-solutions.com 

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Uzuner, Tolga I
Sent: 12 January 2012 16:46
To: r-help@r-project.org
Subject: [R] How can I prevent solve.QP from printing the solution progress ?

Dear R Users,
How can I prevent solve.Qp from printing the solution progress ?
Thanks in advance,
Tolga


Re: [R] Averaging within a range of values

2012-01-13 Thread Jeff Newmiller

Regarding your last question, read ?cut
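
One caveat, offered tentatively: cut() assigns each position to exactly one interval, while the bins in the question overlap (G1 ends at 700, G2 starts at 500). A sketch of an alternative that loops over the bins instead, using the data from the original post; the object names bins, vals and result are made up.

```r
bins <- data.frame(Group = c("G1", "G2", "G3", "G4", "G5"),
                   Start = c(200, 500, 2000, 4000, 7000),
                   End   = c(700, 1000, 3000, 6000, 8000))

vals <- data.frame(
  Pos = c(200, 500, 800, 1000, 2000, 2500, 3000, 3500, 4000, 4500,
          5000, 5500, 6000, 6500, 7000, 7500, 8000, 8500, 9000),
  C0  = c(0.9, 0.8, 0.9, 0.7, 0.6, 1.2, 0.6, 0.7, 0.8, 0.6,
          0.9, 0.7, 0.8, 0.4, 0.5, 0.7, 0.9, 0.8, 0.9),
  C1  = c(0.6, 0.8, 0.7, 0.6, 0.4, 0.8, 1.5, 0.7, 0.8, 0.6,
          0.9, 0.8, 0.7, 0.4, 0.8, 0.9, 0.5, 0.6, 0.8))

# For each bin, average C0 and C1 over the rows with Start <= Pos <= End
bin_means <- t(mapply(function(start, end) {
  inside <- vals$Pos >= start & vals$Pos <= end
  colMeans(vals[inside, c("C0", "C1"), drop = FALSE])
}, bins$Start, bins$End))

result <- cbind(bins, bin_means)
print(result)
```

This mirrors the Excel array formula in the question: the logical vector inside plays the role of (range>=start)*(range<=end).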
---
Jeff NewmillerThe .   .  Go Live...
DCN:jdnew...@dcn.davis.ca.usBasics: ##.#.   ##.#.  Live Go...
  Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
/Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
--- 
Sent from my phone. Please excuse my brevity.


Re: [R] function to replace values doesn't work on vectors

2012-01-13 Thread Sarah Goslee

I don't see any signs of a simple function, but I'd use ifelse().

y <- ifelse(x == 1, 1, ifelse(x == 3, 3, 2))

or some such. (Lack of reproducible example means lack of actual
testing.)
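
A small worked example of that suggestion on a made-up input vector. A function built on if () only looks at a single condition, whereas ifelse() works element by element, which is probably why the original function failed on vectors (the function itself was not included in the post, so this is a guess).

```r
# Hypothetical input: values from 1 to 5
x <- c(1, 2, 3, 4, 5)

# 1 stays 1, 3 stays 3, everything else becomes 2
y <- ifelse(x == 1, 1, ifelse(x == 3, 3, 2))
print(y)
```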

Sarah

On Fri, Jan 13, 2012 at 9:11 AM, WoutH <w.denhollan...@lumc.nl> wrote:
> I've got a numeric vector with values ranging from 1 to 5, I would like to
> catagorize these values like this:
>
> 1 becomes catagory 1
> 3 becomes catagory 3
> And everything else in catagory 2. The simple function I wrote beneath works
> for single numeric data, but for some reason I am unable to feed it vectors.
> Any help would be appreciated, as I'm fairly new to R.


-- 
Sarah Goslee
http://www.functionaldiversity.org


[R] Troubles with stemming (tm + Snowball packages) under MacOS

2012-01-13 Thread Julien Velcin


Dear all,

I am having some trouble using the stemming algorithm provided by the tm
(text mining) + Snowball packages.

Here is my config:

MacOS 10.5
R 2.12.0 / R 2.13.1 / R 2.14.1 (I have tried several versions)

I have installed all the needed packages (tm, rJava, RWeka, Snowball)
+ dependencies. I have deactivated AWT (as described in
http://r.789695.n4.nabble.com/Problem-with-Snowball-amp-RWeka-td3402126.html)
with:


Sys.setenv(NOAWT=TRUE)

The command tm_map(reuters, stemDocument) gives the following errors:

- First time:
Error in .jnew(name) :
  java.lang.InternalError: Can't start the AWT because Java was  
started on the first thread.  Make sure StartOnFirstThread is not  
specified in your application's Info.plist or on the command line

Refreshing GOE props...

- Second time:
Stemmer 'porter' unknown!
Stemmer 'english' unknown!
Stemmer 'porter' unknown!
Stemmer 'english' unknown!
Stemmer 'porter' unknown!
Stemmer 'english' unknown!
Stemmer 'porter' unknown!
Stemmer 'english' unknown!
Stemmer 'porter' unknown!
Stemmer 'english' unknown!
(etc.)

I have already searched the Web for a solution, but I have found nothing
useful.


Here is the full source code (all the libraries are already loaded):
--
Sys.setenv(NOAWT=TRUE)
source <- ReutersSource("reuters-21578.xml", encoding="UTF-8")
reuters <- Corpus(source)
reuters <- tm_map(reuters, as.PlainTextDocument)
reuters <- tm_map(reuters, removePunctuation)
reuters <- tm_map(reuters, tolower)
reuters <- tm_map(reuters, removeWords, stopwords("english"))
reuters <- tm_map(reuters, removeNumbers)
reuters <- tm_map(reuters, stripWhitespace)
reuters <- tm_map(reuters, stemDocument)
--

Thank you for your help,

Julien

[R] Quantiles in boxplot

2012-01-13 Thread René Brinkhuis


Hi,

I have a simple question about quartiles in R, especially how they are
calculated by the boxplot function.
The quartiles (.25 and .75) in boxplot are different from those of the summary
function and also don't match any of the 9 types in the quantile function.
See attachment for details.
Can you give me the details on how the boxplot function calculates these
values?

Cheers,
Rene Brinkhuis (Netherlands)  

Quartiles in R.pdf
Description: Adobe PDF document

[R] access/row access/col access

2012-01-13 Thread statquant2

Hello,
I have a data.frame and I want to transform it into a list of rows or columns.
I can do apply(myDataFrame,MARGIN=1,FUN=???)

I remember that there is a function that means "return" or "access a column" ...
something like :: or ], or [,
I can't remember; can somebody refresh my memory?

Re: [R] Quantiles in boxplot

2012-01-13 Thread Sarah Goslee

The explanation is in ?boxplot.stats (as the help for boxplot states).

Details:

 The two ‘hinges’ are versions of the first and third quartile,
 i.e., close to ‘quantile(x, c(1,3)/4)’.  The hinges equal the
 quartiles for odd n (where ‘n <- length(x)’) and differ for even
 n.  Whereas the quartiles only equal observations for ‘n %% 4 ==
 1’ (n = 1 mod 4), the hinges do so _additionally_ for ‘n %% 4 ==
 2’ (n = 2 mod 4), and are in the middle of two observations
 otherwise.

And so on, with references.
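
A small illustration of the difference, using fivenum() (which boxplot.stats() calls to get the hinges) and the default quantile() type. The vectors are made up for the example:

```r
x_even <- 1:10   # even n: hinges and quartiles differ
x_odd  <- 1:9    # odd n: hinges and quartiles agree

quantile(x_even, c(1, 3) / 4)   # default type 7: 3.25 and 7.75
fivenum(x_even)                 # 1, 3, 5.5, 8, 10 (hinges are 3 and 8)
boxplot.stats(x_even)$stats     # 1, 3, 5.5, 8, 10 (what boxplot() draws)

quantile(x_odd, c(1, 3) / 4)    # 3 and 7
fivenum(x_odd)[c(2, 4)]         # 3 and 7
```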

Sarah

-- 
Sarah Goslee
http://www.functionaldiversity.org

Re: [R] access/row access/col access

2012-01-13 Thread Sarah Goslee

I have to admit, I have very little idea what you are trying to do. Can you
provide an example?

In general, the i-th column of a data frame can be accessed with

mydataframe[, i]

but that doesn't help with whatever you want to do with apply().
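
For reference, a few equivalent ways to pull out a column, plus the fact that `[` is itself a function, which may be what the question was half-remembering. The data frame here is hypothetical:

```r
# Hypothetical data frame for illustration.
mydataframe <- data.frame(a = 1:3, b = c("x", "y", "z"))

mydataframe[, 2]    # second column as a vector
mydataframe[[2]]    # same column, list-style extraction
mydataframe$b       # same column, by name
mydataframe[2, ]    # second row, still a data frame

# `[` can be passed to the apply family like any other function:
second <- sapply(list(1:3, 4:6), `[`, 2)   # second element of each vector
second
# 2 5
```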

Sarah

-- 
Sarah Goslee
http://www.functionaldiversity.org

Re: [R] Odp: Remove space from string

2012-01-13 Thread Bos, Roger

I have NO knowledge of regexpr, but someone helped me out once and I put
it into a function I call trim.  Here is the line I use:

function(x) gsub("^[[:space:]]+|[[:space:]]+$", "", x)

One more thing you can try.  Hope it helps, Roger
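
A short sketch of how the two patterns differ, assuming the input string from the question had one leading and one trailing space:

```r
# Strips whitespace at the start and end only.
trim <- function(x) gsub("^[[:space:]]+|[[:space:]]+$", "", x)

a <- " Remove space "

trim(a)                        # "Remove space" (the inner space is kept)
gsub("[[:space:]]+", "", a)    # "Removespace"  (all whitespace removed)
```

So trim() behaves like str_trim(), while matching whitespace anywhere in the string gives the output that was asked for.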

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
On Behalf Of Petr PIKAL
Sent: Friday, January 13, 2012 7:40 AM
To: Vikram Bahure
Cc: r-help@r-project.org
Subject: [R] Odp: Remove space from string

Hi

 
 Dear R users,
 
 I have some trivial query.
 
 I have a string, I want to remove space from the string.
 
 For eg.
 
 Input:
 a <- " Remove space "
 
 Output required:
 Removespace

It seems to be simple. Even myself with very poor knowledge of regexpr
can suggest solution.

gsub(" ", "", a)

Regards
Petr


 
 I tried using str_trim but only removes end spaces. library(stringr).
 
 Regards
 Vikram
 
[R] plotting regression line in with lattice

2012-01-13 Thread matteo dossena

# Dear All,
# I'm having a bit of trouble here, please help me...
# I have this data
set.seed(4)
mydata <- data.frame(var = rnorm(100),
                     temp = rnorm(100),
                     subj = as.factor(rep(c(1:10),5)),
                     trt = rep(c("A","B"), 50))

# and this model that fits them
lm <- lm(var ~ temp * subj, data = mydata)

# I want to plot the results with lattice and fit the regression line, predicted
# with my model, through them.
# To do so, I'm using this approach, outlined in "Lattice Tricks for the power
# useR" by D. Sarkar

temp_rng <- range(mydata$temp, finite = TRUE)

grid <- expand.grid(temp = do.breaks(temp_rng, 30),
                    subj = unique(mydata$subj),
                    trt = unique(mydata$trt))

model <- cbind(grid, var = predict(lm, newdata = grid))

orig <- mydata[c("var","temp","subj","trt")]

combined <- make.groups(original = orig, model = model)


xyplot(var ~ temp | subj,
       data = combined,
       groups = which,
       type = c("p", "l"),
       distribute.type = TRUE
       )


# So far everything is fine, but I also want to assign a fill to the data
# points for the two treatments trt=1 and trt=2,
# so I have written this piece of code. It works fine, but when it comes to
# plotting the regression line, it seems that type is not recognized by the
# panel function...

my.fill <- c("black", "grey")

plot <- with(combined,
    xyplot(var ~ temp | subj,
           data = combined,
           group = combined$which,
           type = c("p", "l"),
           distribute.type = TRUE,
           panel = function(x, y, ..., subscripts){
               fill <- my.fill[combined$trt[subscripts]]
               panel.xyplot(x, y, pch = 21, fill = my.fill, col = "black")
           },
           key = list(space = "right",
                      text = list(c("trt1", "trt2"), cex = 0.8),
                      points = list(pch = c(21), fill = c("black", "grey")),
                      rep = FALSE)
           )
    )
plot

# I've also tried to move type and distribute.type within panel.xyplot, as well
# as subsetting the data inside panel.xyplot like this

plot <- with(combined,
    xyplot(var ~ temp | subj,
           data = combined,
           panel = function(x, y, ..., subscripts){
               fill <- my.fill[combined$trt[subscripts]]
               panel.xyplot(x[combined$which=="original"],
                            y[combined$which=="original"], pch = 21, fill = my.fill, col = "black")
               panel.xyplot(x[combined$which=="model"],
                            y[combined$which=="model"], type = "l", col = "black")
           },
           key = list(space = "right",
                      text = list(c("trt1", "trt2"), cex = 0.8),
                      points = list(pch = c(21), fill = c("black", "grey")),
                      rep = FALSE)
           )
    )
plot

# but no success with that either...
# Can anyone help me get the predicted values plotted as a line instead of
# as points?
# Much appreciated,
# matteo





Re: [R] function to replace values doesn't work on vectors

2012-01-13 Thread Berend Hasselman


WoutH wrote
 
 I've got a numeric vector with values ranging from 1 to 5, I would like to
 categorize these values like this:
 
 1 becomes category 1
 3 becomes category 3
 And everything else in category 2. The simple function I wrote beneath
 works for single numeric data, but for some reason I am unable to feed it
 vectors. Any help would be appreciated, as I'm fairly new to R.
 
 function.123 <- function(x)
 {
   x1 <- ifelse(x == 1 | x == 3, return(x) ,return(2))
   return(x1)
 } 
 

Don't use return() inside the ifelse(), since return() returns immediately from
the function.
Just 

x1 <- ifelse(x==1 | x==3, x, 2)

will do.
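
To see the effect side by side, here is a small sketch. The names recode_broken and recode_fixed are made up for this illustration, and the input vector is hypothetical:

```r
# As posted: ifelse() evaluates its "yes" argument as soon as any test is TRUE,
# and that argument is return(x), so the function exits with x unchanged.
recode_broken <- function(x) {
  x1 <- ifelse(x == 1 | x == 3, return(x), return(2))
  return(x1)
}

# Without the inner return() calls, ifelse() picks element by element.
recode_fixed <- function(x) {
  ifelse(x == 1 | x == 3, x, 2)
}

x <- c(1, 2, 3, 4)
recode_broken(x)   # 1 2 3 4 (input handed back untouched)
recode_fixed(x)    # 1 2 3 2
```

This is also why the posted version appears to work on single numbers: with one element, returning x or 2 outright happens to be the right answer.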

Berend


Re: [R] access/row access/col access

2012-01-13 Thread Jeff Newmiller

http://cran.r-project.org/doc/manuals/R-intro.pdf
---
Jeff NewmillerThe .   .  Go Live...
DCN:jdnew...@dcn.davis.ca.usBasics: ##.#.   ##.#.  Live Go...
  Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
/Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
--- 
Sent from my phone. Please excuse my brevity.

[R] Problem Installing R to SuSE 10 via RPM

2012-01-13 Thread Matthew Pettis

Hi,

I'm trying to install R from an rpm locally to my account (the reason I'm
not doing it through yast/yast2/zypper is that the sys admin isn't yet
willing to install it, and doesn't want to support it, but will help me
support it if I install it locally -- in short, policy problems rather than
technical).  Below is the SuSE version, Kernel version, and rpm install
error I'm getting, as well as the error...

Can anyone help me with the error?  I'm trying to install R-base 2.14.1,
but it is telling me that I need R-base version 2.14.1 as a dependency.  Am
I using the wrong rpm for an installation starting from scratch?

I got the rpm from:
http://download.opensuse.org/repositories/devel:/languages:/R:/base/SLE_10/x86_64/

Thanks,
Matt




pettis@swat:~/bin> cat /etc/*-release
SUSE Linux Enterprise Server 10 (x86_64)
VERSION = 10
PATCHLEVEL = 2

pettis@swat:~/bin> uname -a
Linux swat 2.6.16.60-0.34-smp #1 SMP Fri Jan 16 14:59:01 UTC 2009 x86_64
x86_64 x86_64 GNU/Linux

pettis@swat:~/bin> rpm -ivh R-base-devel-2.14.1-30.1.x86_64.rpm
warning: R-base-devel-2.14.1-30.1.x86_64.rpm: Header V3 DSA signature:
NOKEY, key ID 793371fe
error: Failed dependencies:
        R-base = 2.14.1 is needed by R-base-devel-2.14.1-30.1.x86_64

[R] Nabble? Was Re: function to replace values doesn't work on vectors

2012-01-13 Thread Sarah Goslee

Interesting: The email I received through the R-help list didn't have
all the information that was apparently there:

On Fri, Jan 13, 2012 at 11:32 AM, Berend Hasselman b...@xs4all.nl wrote:

 WoutH wrote

 I've got a numeric vector with values ranging from 1 to 5, I would like to
 categorize these values like this:

 1 becomes category 1
 3 becomes category 3
 And everything else in category 2. The simple function I wrote beneath
 works for single numeric data, but for some reason I am unable to feed it
 vectors. Any help would be appreciated, as I'm fairly new to R.

 function.123 <- function(x)
     {
       x1 <- ifelse(x == 1 | x == 3, return(x) ,return(2))
       return(x1)
     }


This function? Not in the original email as I received it, although it showed
up in Berend's reply.

I hope that it was a momentary glitch; greater disagreement between Nabble
and the email list will cause all sorts of fun. If the interface, whatever it
is, starts stripping out code, I'll have to quit answering Nabble queries
entirely.

Sarah

-- 
Sarah Goslee
http://www.functionaldiversity.org

[R] Coloring counties on a full US map based on a certain criterion

2012-01-13 Thread Dimitri Liakhovitski

Dear Rers,

is there a way to color counties on a full US map based on a criterion
one establishes (i.e., all counties I assign the same number should be
the same color)?
I explored a bit and it looks like the package maps might be of help.
library(maps)
One could get a map of the US: map('usa')
One could get counties within a US state:
map('county', 'iowa', fill = TRUE, col = palette())

Would it be possible to read in a file with counties and their
assignments (some counties have a 1, some counties have a 2, etc.) and
then have one map of the US with counties colored based on their
assignment?

Thanks a lot for any hint!


-- 
Dimitri Liakhovitski

Re: [R] Problem Installing R to SuSE 10 via RPM

2012-01-13 Thread Gavin Blackburn

I think it's saying you need to install R-base before R-base-devel.

You'll need to add a CRAN repository, as SUSE might not have the most up-to-date
version of R.

This is the code for Ubuntu; I assume it's the same, just change the distro and
keyserver:

sudo apt-get update
sudo add-apt-repository 'deb http://cran.ma.imperial.ac.uk/bin/linux/ubuntu oneiric/'
gpg --keyserver keyserver.ubuntu.com --recv-key E084DAB9
gpg -a --export E084DAB9 | sudo apt-key add -



You may also want to get Sun java, again, change distro and keyserver:

sudo add-apt-repository ppa:ferramroberto/java
sudo apt-key adv --recv-key --keyserver keyserver.ubuntu.com B725097B3ACC3965
sudo apt-get update
sudo apt-get install sun-java6-jdk sun-java6-plugin

Then run:


sudo apt-get install r-base r-base-dev
sudo R CMD javareconf

Cheers,

Gavin.

Re: [R] Problem Installing R to SuSE 10 via RPM

2012-01-13 Thread Matthew Pettis

Thanks, will do!  I thought devel included base, but evidently, that's not
the case...

-- 
Do not seek to follow in the footsteps of the wise men of old. Seek what
they sought.

- Matsuo Munefusa (Basho)

Re: [R] Coloring counties on a full US map based on a certain criterion

2012-01-13 Thread Sarah Goslee

Hi,

You've just about got it. See below.

On Fri, Jan 13, 2012 at 11:52 AM, Dimitri Liakhovitski
dimitri.liakhovit...@gmail.com wrote:
 Dear Rers,

 is there a way to color counties on a full US map based on a criterion
 one establishes (i.e., all counties I assign the same number should be
 the same color)?
 I explored a bit and it looks like the package maps might be of help.
 library(maps)
 One could get a map of the US: map('usa')
 One could get counties within a US state: map('county', 'iowa', fill
 = TRUE, col = palette())

Using a random sampling to give you the basic idea.
There are 99 counties in Iowa, so to construct the criterion:
countycol <- sample(1:5, 99, replace=TRUE)
And to invent a set of colors (RColorBrewer is a better choice for
final maps):
classcolors <- rainbow(5)

then you can use them in your map just as you would for any other
plotting command:

map('county', 'iowa', fill= TRUE, col = classcolors[countycol])

 Would it be possible to read in a file with counties and their
 assignments (some counties have a 1, some counties have a 2, etc.) and
 then have one map of the US with counties colored based on their
 assignment?

Absolutely. The only thing you have to watch out for is that you put your
values in the same order as:
map('county', 'iowa', plot=FALSE)$names
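
One way to get that ordering is match(). A minimal sketch, assuming a lookup table with one row per county; the table, its column names, and the shortened names vector are all made up here so the example runs without the maps package:

```r
# Hypothetical lookup table, e.g. as read with read.csv():
# one row per county and the class (1-5) assigned to it.
assignments <- data.frame(
  county = c("iowa,polk", "iowa,adair", "iowa,story"),
  class  = c(3, 1, 5)
)

# Stand-in for map('county', 'iowa', plot = FALSE)$names, shortened here.
mapnames <- c("iowa,adair", "iowa,adams", "iowa,polk", "iowa,story")

classcolors <- rainbow(5)

# match() reorders the classes to follow the map's polygon order;
# counties missing from the table ("iowa,adams" here) get NA.
countycol <- assignments$class[match(mapnames, assignments$county)]
cols <- classcolors[countycol]

# With the maps package loaded, the same vector would then be used as:
# map('county', 'iowa', fill = TRUE, col = cols)
```

Checking for NA in countycol before plotting is a cheap way to catch county names that are spelled differently in the file and in the map database.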

Sarah

-- 
Sarah Goslee
http://www.functionaldiversity.org

Re: [R] Coloring counties on a full US map based on a certain criterion

2012-01-13 Thread Dimitri Liakhovitski

Sarah, this is amazing, thank you so much.
One question: I am trying to do for the whole US (on one map) what
you've helped me do for Iowa.
In other words, I would like to create a file of inputs like your
countycol with 1,000+ lines - for all US counties (probably without
Hawaii and Alaska, right?) and then somehow feed it into the map
command but so that the output is the whole map of the US, and not one
state. Is it at all possible?
Or maybe it's possible to create 48 colored state maps one by one -
the way you showed me - save them, and then somehow paste those
states onto the whole US map?
Thanks a lot for your help!
Dimitri

-- 
Dimitri Liakhovitski
marketfusionanalytics.com

Re: [R] access/row access/col access

2012-01-13 Thread R. Michael Weylandt michael.weyla...@gmail.com

Like Sarah said, what you say below makes very little sense, but as a total 
shot in the dark, is this what you mean?

lapply(1:nrow(df), function(i) df[i, ])
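
A slightly fuller sketch of both directions, on a made-up data frame (the name df and its contents are assumptions for illustration):

```r
df <- data.frame(id = 1:3, score = c(2.5, 3.5, 4.5))

# One list element per row; each element is a one-row data frame.
rows <- lapply(seq_len(nrow(df)), function(i) df[i, ])

# split() produces the same pieces, named by row number.
rows2 <- split(df, seq_len(nrow(df)))

# A data frame is already a list of columns, so this just drops the class.
cols <- as.list(df)
```

seq_len(nrow(df)) is used instead of 1:nrow(df) so that an empty data frame gives an empty list rather than iterating over c(1, 0).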

Michael

Re: [R] Coloring counties on a full US map based on a certain criterion

2012-01-13 Thread Sarah Goslee

On Fri, Jan 13, 2012 at 12:15 PM, Dimitri Liakhovitski
dimitri.liakhovit...@gmail.com wrote:
 Sarah, this is amazing, thank you so much.
 One question: I am trying to do for the whole US (on one map) what
 you've helped me do for Iowa.
 In other words, I would like to create a file of inputs like your
 countycol with 1,000+ lines - for all US counties (probably without
 Hawaii and Alaska, right?) and then somehow feed it into the map
 command but so that the output is the whole map of the US, and not one
 state. Is it at all possible?

Sure. Just start with
map('county')
instead.
I like to add something like:
map('state', lwd=3, add=TRUE)

You'll instead need to match your values to the names for the entire US:
> length(map('county', plot=FALSE)$names)
[1] 3082

Sarah

-- 
Sarah Goslee
http://www.functionaldiversity.org

[R] Portfolio Optimization

2012-01-13 Thread Sal Pellettieri

Hi,

I'm an R newbie and I've been struggling with an optimization problem for
the past couple of days now.

Here's the problem - I have a matrix of expected payouts from different
stock option strategies. Each column in my matrix represents a different
stock and each row represents the return to the strategy given a certain
market move. So the rows are not a time series of percentage returns but a
dollar payout in different expected scenarios, i.e.

Expected Return Matrix (ER) =
               stock1   stock2   ...   stockn
   scenario1   $        $              $
   scenario2   $        $              $
   scenario3   $        $              $
   ...

I want to create an optimal portfolio of these strategies by applying a
vector of weights. The weights will be the number of contracts of each to
buy and won't be a percentage weighting. There are a few constraints I need
it to comply with:

   - The weights have to be integers
   - The minimum portfolio return (ER * Weights) across the scenarios has to
   be greater than some negative number I specify
   - There has to be a certain minimum number of stocks in the portfolio, so
   length(weights) > some number I specify.

Any help is GREATLY appreciated since I have tried so many different
functions and packages. Even if someone can just lead me to the correct
function to use that would be a great help as I've looked at optim,
solveLP, ROI package and many others.
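
Before picking a solver it can help to write the three constraints down as a plain check. The sketch below is only that: the payout numbers, the function name is_feasible and its arguments are invented for illustration, and "number of stocks in the portfolio" is read as the number of non-zero weights, which is an assumption.

```r
# Made-up payout matrix: rows are scenarios, columns are stocks.
ER <- matrix(c(120, -40,  60,
               -30,  80, -20,
                50,  10, -70),
             nrow = 3, byrow = TRUE,
             dimnames = list(paste0("scenario", 1:3), paste0("stock", 1:3)))

# TRUE if the weight vector w satisfies all three constraints.
is_feasible <- function(w, ER, min_payout, min_stocks) {
  all(w == round(w)) &&              # integer numbers of contracts
    min(ER %*% w) > min_payout &&    # worst scenario stays above the floor
    sum(w != 0) > min_stocks         # enough different stocks are held
}

is_feasible(c(1, 1, 1), ER, min_payout = -50, min_stocks = 1)
is_feasible(c(0, 0, 3), ER, min_payout = -50, min_stocks = 1)
```

A check like this does not optimize anything; it only makes the constraints explicit so that candidate solutions from any optimizer can be verified.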


Thanks,
S


[R] Brillouin index

2012-01-13 Thread Philipp Fischer

Dear colleagues.

I wonder if anybody knows about a procedure in R to calculate the Brillouin 
Diversity index. 

I searched the net but did not find anything about it.

Thanks a lot for any help

Best, Philipp
*** 
Prof. Dr. Philipp Fischer
Head of AWI Center for Scientific Diving  Dept. In situ Ecology
Section Shelf Sea Systems
Alfred-Wegener-Institut
Biologische Anstalt Helgoland
Building A
D-27498 Helgoland
 
Phone: +49(4725)819-3344
Skype: fischer_philipp
Fax: +49(4725)819-3369
 
http://www.awi.de/People/show?pfischer
http://www.awi.de/en/infrastructure/underwater/scientific_diving/
http://www.awi.de/en/research/research_divisions/biosciences/shelf_sea_ecology/
http://www.forschungstauchen-deutschland.de
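
For reference, the Brillouin index is usually written as H = (ln N! - sum(ln n_i!)) / N, where n_i is the number of individuals of species i and N is the total count. A minimal base R sketch under that definition follows; the function name brillouin is made up here, and natural logarithms are assumed.

```r
# Brillouin diversity index for a vector of species counts.
# lfactorial() works on the log scale, so large counts do not overflow.
brillouin <- function(counts) {
  counts <- counts[counts > 0]       # empty species do not contribute
  N <- sum(counts)
  (lfactorial(N) - sum(lfactorial(counts))) / N
}

brillouin(c(10, 5, 3, 1))
```

A community with a single species gives 0, and the value grows as individuals are spread more evenly over more species.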





[R] tm package, custom reader

2012-01-13 Thread pl.r...@gmail.com

I need help with creating a custom XML reader for use with the tm package.  The
objective is to create a corpus for analysis.  The files that I'm working with
come from solr and are in a funky XML format; nevertheless I'm able to
parse the XML files using the solrDocs.R function provided by Duncan Temple
Lang.

The problem I'm having is that once I parse the document I need to create a
custom reader that would be compatible with the tm package.

If someone has built a custom reader for the tm package, or has some ideas of how to
go about this, I would greatly appreciate the help.

Thanks 


Re: [R] meta-analysis normal quantile plot metafor

2012-01-13 Thread Michael Dewey


At 15:53 11/01/2012, Ricc wrote:

Hello,

I once used the metawin software to perform a meta-analysis (see
metawinsoft, Rosenberg et al.) and produced normal qqplot to test for
a potential bias in the dataset.
I now want to re-use the same dataset with the package metafor by W.
Viechtbauer (great package btw).

I run the qqnorm.rma.uni function. I use standardized effect sizes as
in metawin.


I think it would help if you said which parameters you used to 
control the envelope. Did you smooth it? Did you use the Bonferroni correction?




QQplot generated with metafor differs from the plot obtained with
metawin: most of the data points fall outside the confidence envelope
(using the same confidence level). I don't understand very well how
the pseudo confidence envelope was created in metafor. Is it more
conservative than that from metawin or created using the package
envelope? Unfortunately I do not have access to metawin's code so
I cannot compare implementations, but the manual leads me to think that
metawin prints classical confidence intervals...

Thanks for input!
Ricc

More details:
R version 2.13.1 (2011-07-08)
Platform: x86_64-pc-linux-gnu (64-bit)
metafor_1.6-0


Michael Dewey
i...@aghmed.fsnet.co.uk
http://www.aghmed.fsnet.co.uk/home.html


[R] fUtilities removed

2012-01-13 Thread Dominic Comtois

When setting up my new machine, I had the surprise to see that Package
'fUtilities' was removed from the CRAN repository.

 

This is problematic for my work. I use many of its functions, and it will
complicate things a lot if other programmers want to use my previous code in
the future. Plus, nowhere can I find the justification for its removal.

 

Thanks for any info on this



Re: [R] GLHT in multcomp: Two similar models, one doesn't work

2012-01-13 Thread gaiarrido

Nobody?

-
Mario Garrido Escudero
PhD student
Dpto. de Biología Animal, Ecología, Parasitología, Edafología y Qca. Agrícola
Universidad de Salamanca

Re: [R] Problem Installing R to SuSE 10 via RPM

2012-01-13 Thread Marc Schwartz

Two things:

1. Looking at the sizes of the RPMs at the URL provided, the 'R-base-devel' RPM 
is only 84k. So as with other Linux distros, that is likely to contain R header 
files and other such things to enable compilation of CRAN packages during their 
installation using the source tarballs. The 'R-base' RPM is 33Mb, so that is 
your primary install for R. You will want both.

2. I don't know about the configuration of the RPMs on SUSE, but presume that 
as with RH/Fedora, the RPMs can be created in such a fashion that they are not 
relocatable. That means that they are configured to be only installed 
'system-wide', not to a specific user's directory tree. You may have issues 
trying to install them without root access on your computer, thus may get other 
permission related errors when you get the correct RPM. Whether they are or are 
not relocatable would be a question for the SUSE maintainer for the R package. 
If they are, there are 'rpm' command line arguments that are relevant. So use 
'man rpm' for more details. If the RPMs are not relocatable, you will be left 
with the option of building R from source in order to install R locally and 
will want to read the R Installation and Administration manual for details.

HTH,

Marc Schwartz

On Jan 13, 2012, at 11:00 AM, Matthew Pettis wrote:

 Thanks, will do!  I thought devel included base, but evidently, that's not
 the case...
 
 On Fri, Jan 13, 2012 at 10:58 AM, Gavin Blackburn 
 gavin.blackb...@strath.ac.uk wrote:
 
 I think it's saying you need to install R-base before R-base-devel.
 
 You'll need to add a cran repository as SUSE might not have the most
 up-to-date version of R.
 
 This is the code for Ubuntu; I assume it's the same, just change the distro
 and keyserver:
 
 sudo apt-get update
 sudo add-apt-repository 'deb http://cran.ma.imperial.ac.uk/bin/linux/ubuntu oneiric/'
 gpg --keyserver keyserver.ubuntu.com --recv-key E084DAB9
 gpg -a --export E084DAB9 | sudo apt-key add -
 
 
 
 You may also want to get Sun java, again, change distro and keyserver:
 
 sudo add-apt-repository ppa:ferramroberto/java
 sudo apt-key adv --recv-key --keyserver keyserver.ubuntu.com B725097B3ACC3965
 sudo apt-get update
 sudo apt-get install sun-java6-jdk sun-java6-plugin
 
 Then run:
 
 
 sudo apt-get install r-base r-base-dev
 sudo R CMD javareconf
 
 Cheers,
 
 Gavin.
 
 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
 On Behalf Of Matthew Pettis
 Sent: 13 January 2012 16:43
 To: r-help@r-project.org
 Subject: [R] Problem Installing R to SuSE 10 via RPM
 
 Hi,
 
 I'm trying to install R from an rpm locally to my account (the reason I'm
 not doing it through yast/yast2/zypper is that the sys admin isn't yet
 willing to install it, and doesn't want to support it, but will help me
 support it if I install it locally -- in short, policy problems rather than
 technical).  Below is the SuSE version, Kernel version, and rpm install
 error I'm getting, as well as the error...
 
 Can anyone help me with the error?  I'm trying to install R-base 2.14.1,
 but it is telling me that I need R-base version 2.14.1 as a dependency.  Am
 I using the wrong rpm for an installation starting from scratch?
 
 I got the rpm from:
 
 http://download.opensuse.org/repositories/devel:/languages:/R:/base/SLE_10/x86_64/
 
 Thanks,
 Matt
 
 
 
 
 pettis@swat:~/bin> cat /etc/*-release
 SUSE Linux Enterprise Server 10 (x86_64)
 VERSION = 10
 PATCHLEVEL = 2
 
 pettis@swat:~/bin> uname -a
 Linux swat 2.6.16.60-0.34-smp #1 SMP Fri Jan 16 14:59:01 UTC 2009 x86_64
 x86_64 x86_64 GNU/Linux
 
 pettis@swat:~/bin> rpm -ivh R-base-devel-2.14.1-30.1.x86_64.rpm
 warning: R-base-devel-2.14.1-30.1.x86_64.rpm: Header V3 DSA signature:
 NOKEY, key ID 793371fe
 error: Failed dependencies:
   R-base = 2.14.1 is needed by R-base-devel-2.14.1-30.1.x86_64


[R] multidimensional array calculation

2012-01-13 Thread Johannes Radinger

Hello,

probably it is quite easy but I can't get it: I have
multiple numeric vectors and a function using
all of them to calculate a new value:

L <- c(200,400,600)
AR <- c(1.5)
SO <- c(1,3,5)
T <- c(30,365)

fun <- function(L,AR,SO,T){
exp(L*AR+sqrt(SO)*log(T))
}

How can I get an array or dataframe where
all possible combinations of the factors are listed
and the new value is calculated?

I thought about an array like:
array(NA, dim = c(3,1,3,2), 
dimnames=list(c(200,400,600),c(1.5),c(1,3,5),c(30,365)))

but how can I get the array populated according to the function?

As I want to get a 2D dataframe in the end, I probably will use the melt.array()
function from the reshape package, or is there another way to simply get such
a full-factorial dataframe with all possible combinations?

Best regards,
Johannes

Re: [R] access/row access/col access

2012-01-13 Thread David Winsemius



On Jan 13, 2012, at 10:45 AM, statquant2 wrote:


Hello,
I have a data.frame and I want to transform it into a list of rows or
columns.

I can do apply(myDataFrame,MARGIN=1,FUN=???)

I remember that there is a function which means return or access
column ...

something like :: or ], or [,
I can't remember; can somebody refresh my memory?


It is definitely the long way around to do this, since [,5] would
get the same results, but if you want the fifth element in each row
you could do this:


> dat
  id x1 x2 x3 y1 y2 y3  z1  z2  z3   v
1  1  2  4  5 10 20 15 200 150 170 2.5
2  2  3  7  6 25 35 40 300 350 400 4.2
> apply(dat, 1, "[", 5)
[1] 10 25

When 'apply' is used, the trailing arguments are matched either by
position or name to the arguments of the function. In this case the 5
gets matched to the i for the "[" function. Because "[" is primitive
the name is actually ignored even if offered, so using any other name
would not change the result:


> apply(dat, 1, "[", j=5)
[1] 10 25

Whereas if you were using quantile as your function, you might perhaps
use probs=c(.25,.75), na.rm=TRUE or na.rm=TRUE, probs=c(.25,.75).
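
A self-contained sketch of the points above, rebuilding the two-row dat from the printout. The last two lines are an assumption about what "a list of rows or columns" was meant to be; the object names are made up for illustration.

```r
dat <- data.frame(id = 1:2, x1 = c(2, 3), x2 = c(4, 7), x3 = c(5, 6),
                  y1 = c(10, 25), y2 = c(20, 35), y3 = c(15, 40),
                  z1 = c(200, 300), z2 = c(150, 350), z3 = c(170, 400),
                  v = c(2.5, 4.2))

# Long way: apply() hands each row to "[" together with the extra argument 5.
fifth_apply <- apply(dat, 1, "[", 5)

# Short way: ordinary column indexing gives the same values.
fifth_index <- dat[, 5]

# With a closure such as quantile(), extra arguments are matched by name,
# so their order does not matter.
q <- apply(dat[, -1], 1, quantile, probs = c(.25, .75), na.rm = TRUE)

# A data frame already is a list of columns; split() gives a list of rows.
cols_as_list <- as.list(dat)
rows_as_list <- split(dat, seq_len(nrow(dat)))
```

Note that apply() converts the data frame to a matrix first, which is harmless here only because every column is numeric.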


--
David Winsemius, MD
West Hartford, CT


Re: [R] Coloring counties on a full US map based on a certain criterion

2012-01-13 Thread Dimitri Liakhovitski

Thank you so much, Sarah. I tried, and it's working just wonderfully
(code below).
One last question, if I may: is it possible to get rid of borders
between counties (just leave the fill)? I did not find this argument
in help...
Thank you!
Dimitri

### My criterion for all counties:
allcounties <- data.frame(county=map('county', plot=FALSE)$names)
allcounties$group <- c(rep(1:6,513),rep(1,4))[order(c(rep(1:6,513),rep(1,4)))]
### My colors:
classcolors <- rainbow(6)
map('county',fill=TRUE,col=classcolors[allcounties$group])
map('state', lwd=2, add=TRUE)



 Sure. Just start with
 map('county')
 instead.
 I like to add something like:
 map('state', lwd=3, add=TRUE)

I am trying:
### My criterion for all counties in the US:
allcounties <- data.frame(county=map('county', plot=FALSE)$names)
allcounties$group <- sample(1:5,3082,replace=TRUE)
### My colors:
classcolors <- rainbow(5)
### Trying to build the map - not working:
map(database='usa',regions='county',fill=TRUE,col=classcolors[allcounties$group])



 You'll need to instead coordinate with the names of the entire US:
 length(map('county', plot=FALSE)$names)
 [1] 3082

 Sarah

 Or maybe it's possible to create 48 colored state maps one by one -
 the way you showed me - save them, and then somehow paste those
 states onto the whole US map?
 Thanks a lot for your help!
 Dimitri

 On Fri, Jan 13, 2012 at 12:05 PM, Sarah Goslee sarah.gos...@gmail.com 
 wrote:
 Hi,

 You've just about got it. See below.

 On Fri, Jan 13, 2012 at 11:52 AM, Dimitri Liakhovitski
 dimitri.liakhovit...@gmail.com wrote:
 Dear Rers,

 is there a way to color counties on a full US map based on a criterion
 one establishes (i.e., all counties I assign the same number should be
 the same color)?
 I explored a bit and looks like the package maps might be of help.
 library(maps)
 One could get a map of the US: map('usa')
 One could get counties within a US state: map('county', 'iowa', fill
 = TRUE, col = palette())

 Using a random sampling to give you the basic idea.
 There are 99 counties in Iowa, so to construct the criterion:
 countycol <- sample(1:5, 99, replace=TRUE)
 And to invent a set of colors (RColorBrewer is a better choice for
 final maps):
 classcolors <- rainbow(5)

 then you can use them in your map just as you would for any other
 plotting command:

 map('county', 'iowa', fill= TRUE, col = classcolors[countycol])

 Would it be possible to read in a file with counties and their
 assignments (some counties have a 1, some counties have a 2, etc.) and
 then have one map of the US with counties colored based on their
 assignment?

 Absolutely. The only thing you have to watch out for is that you put your
 values in the same order as:
 map('county', 'iowa', plot=FALSE)$names



 --
 Sarah Goslee
 http://www.functionaldiversity.org



-- 
Dimitri Liakhovitski
marketfusionanalytics.com


[R] Merging data XXXX

2012-01-13 Thread Dan Abner

Hello everyone,

I have 1 data frame (just a vector in the example below) with 12
individuals listed and a separate vector of 36 days (in week intervals).
What is the best way to merge these together so that each individual
(specialist here) has all 36 days matched with their specialist number (a
one to many merge in SAS; essentially resulting in long format data).

implement <- as.Date("2012-4-30")
start <- implement-91
weeks <- seq(start,by="weeks",length=36)
weeks
specialist <- 1:12


Thanks!

Dan


Re: [R] Coloring counties on a full US map based on a certain criterion

2012-01-13 Thread Dimitri Liakhovitski

Just to clarify, according to help about the fill argument:
logical flag that says whether to draw lines or fill areas. If FALSE,
the lines bounding each region will be drawn (but only once, for
interior lines). If TRUE, each region will be filled using colors from
the col = argument, and bounding lines will not be drawn.
We have fill=TRUE - so why are the county borders still drawn?
Thank you!
Dimitri


-- 
Dimitri Liakhovitski
marketfusionanalytics.com


Re: [R] Coloring counties on a full US map based on a certain criterion

2012-01-13 Thread Sarah Goslee

On Fri, Jan 13, 2012 at 1:41 PM, Dimitri Liakhovitski
dimitri.liakhovit...@gmail.com wrote:
 Thank you somuch, Sarah. I tried, and it's working just wonderfully
 (code below).
 One last question, if I may: is it possible to get rid of borders
 between counties (just leave the fill)? I did not find this argument
 in help...

But the help does say that additional arguments are passed to lines(),
so you can use lty=0.

That can leave white bits between counties if the areas don't line up
precisely, so I think it looks better with the lines in black.

Sarah

-- 
Sarah Goslee
http://www.functionaldiversity.org


Re: [R] fUtilities removed

2012-01-13 Thread David Winsemius



On Jan 13, 2012, at 12:33 PM, Dominic Comtois wrote:

When setting up my new machine, I had the surprise to see that  
Package

'fUtilities' was removed from the CRAN repository.



https://stat.ethz.ch/pipermail/rmetrics-core/2012-January/000554.html
https://stat.ethz.ch/pipermail/rmetrics-core/2011-November/000549.html




This is problematic for my work. I use many of its functions, and it  
will
complicate things a lot if other programmers want to use my previous  
code in
the future. Plus, nowhere can I find the justification for its  
removal.


You need to send your questions to the maintainers. They apparently  
did not respond to the requests to fix the errors.




Thanks for any info on this


You should perhaps subscribe to the list that is established for  
discussion on this and related packages.


--

David Winsemius, MD
West Hartford, CT


Re: [R] Merging data XXXX

2012-01-13 Thread Marc Schwartz


On Jan 13, 2012, at 12:42 PM, Dan Abner wrote:

 Hello everyone,
 
 I have 1 data frame (just a vector in the example below) with 12
 individuals listed and a separate vector of 36 days (in week intervals).
 What is the best way to merge these together so that each individual
 (specialist here) has all 36 days matched with their specialist number (a
 one to many merge in SAS; essentially resulting in long format data).
 
 implement <- as.Date("2012-4-30")
 start <- implement-91
 weeks <- seq(start,by="weeks",length=36)
 weeks
 specialist <- 1:12
 
 
 Thanks!
 
 Dan


Two ways:

  merge(specialist, weeks)

  expand.grid(specialist, weeks)

See ?merge which performs a SQL-like join operation and ?expand.grid which 
provides for all possible combinations of two or more vectors

HTH,

Marc Schwartz


Re: [R] Coloring counties on a full US map based on a certain criterion

2012-01-13 Thread Sarah Goslee

On Fri, Jan 13, 2012 at 1:52 PM, Dimitri Liakhovitski
dimitri.liakhovit...@gmail.com wrote:
 Just to clarify, according to help about the fill argument:
 logical flag that says whether to draw lines or fill areas. If FALSE,
 the lines bounding each region will be drawn (but only once, for
 interior lines). If TRUE, each region will be filled using colors from
 the col = argument, and bounding lines will not be drawn.
 We have fill=TRUE - so why are the county borders still drawn?
 Thank you!
 Dimitri


This prompted me to check the code:

if fill=TRUE, map() calls polygon()
if fill=FALSE, map() calls lines()

But polygon() draws borders by default.
> plot(c(0,1), c(0,1), type="n")
> polygon(c(0,0,1,1), c(0,1,1,0), col="yellow")

To not draw borders, the border argument is provided:
> plot(c(0,1), c(0,1), type="n")
> polygon(c(0,0,1,1), c(0,1,1,0), col="yellow", border=NA)

But that fails in map():
> map('county', 'iowa', fill=TRUE, col=rainbow(20), border=NA)
Error in par(pin = p) :
  invalid value specified for graphical parameter pin

because border is already used as a named argument in map(), for setting the
size of the plot area, so there's no way to alter the border argument
to polygon.

The work-around I suggested previously (lty=0) seems to be the only
way to deal with the problem.

Sarah


Re: [R] Merging data XXXX

2012-01-13 Thread David Winsemius



On Jan 13, 2012, at 1:42 PM, Dan Abner wrote:


Hello everyone,

I have 1 data frame (just a vector in the example below) with 12
individuals listed and a separate vector of 36 days (in week  
intervals).

What is the best way to merge these together so that each individual
(specialist here) has all 36 days matched with their specialist  
number (a

one to many merge in SAS; essentially resulting in long format data).

implement <- as.Date("2012-4-30")
start <- implement-91
weeks <- seq(start,by="weeks",length=36)
weeks
specialist <- 1:12


?rep
?data.frame

I would not think 'merge' is needed ... you just want to make the  
right number of copies and you should be paying attention to the  
'each' and 'times' parameters.
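
A minimal sketch of the rep() route, assuming the goal is one row per specialist and week. The name long is made up for illustration; the date is written with a two-digit month only for readability.

```r
implement  <- as.Date("2012-04-30")
start      <- implement - 91
weeks      <- seq(start, by = "week", length.out = 36)
specialist <- 1:12

# 'each' repeats every specialist 36 times in a block;
# 'times' repeats the whole vector of weeks once per specialist.
long <- data.frame(
  specialist = rep(specialist, each  = length(weeks)),
  week       = rep(weeks,      times = length(specialist))
)

head(long, 3)
```

The result has 12 * 36 = 432 rows in long format, and the week column keeps its Date class.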



[[alternative HTML version deleted]]

The above suggests failure to adhere to the below

PLEASE do read the posting guide http://www.R-project.org/posting-guide.html


--

David Winsemius, MD
West Hartford, CT


Re: [R] grplasso

2012-01-13 Thread Scott Raynaud

So does anyone use this package?

- Original Message -
From: Scott Raynaud scott.rayn...@yahoo.com
To: r-help@r-project.org r-help@r-project.org
Cc: 
Sent: Tuesday, January 10, 2012 1:40 PM
Subject: grplasso

I want to use the grplasso package on a data set where I want to fit a linear 
model.  My interest is in identifying significant beta coefficients.  The 
documentation is a bit cryptic so I'd appreciate some help.

I know this is a strategy for large numbers of variables but consider a simple 
case for pedagogical purposes.  Say I have two 3 category predictors (2 dummies 
each), a binary predictor and a continuous predictor with a continuous outcome:

y  x1  x2  x3  x4 x5 x6
rows of data here
..
..

Naturally, I want to select x1 and x2 as a group and x3 and x4 as another 
group.  
The documentation has a couple of examples but it's not clear how they 
translate 
to the current problem.  How do I specify my groups and run the lasso 
regression?

Looks like this is the grouping part:

index <- c(NA,)

but I'm not sure how to specify the df for the variables past the NA for the 
intercept.

Once that's defined the penalty can be specified:

lambda <- lambdamax(x, y = y, index = index, penscale = sqrt,
model = LogReg()) * 0.5^(0:5)

In my case I'd use LinReg for the model.

Then the model:

fit <- grplasso(x, y = y, index = index, lambda = lambda, model = LogReg(),
penscale = sqrt, control = grpl.control(update.hess = "lambda", trace = 0))

again using LinReg for the model.
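
One way to build the grouping vector for the layout described above is to let model.matrix() do the bookkeeping. This is only a sketch on simulated data: the data frame d and its column names are invented, and it assumes that index should hold NA for the unpenalized intercept and one shared integer per group of columns.

```r
set.seed(1)
n <- 60
d <- data.frame(
  f1 = factor(rep(c("a", "b", "c"), length.out = n)),  # 3 categories -> 2 dummies
  f2 = factor(rep(c("u", "v", "w"), each = n / 3)),    # 3 categories -> 2 dummies
  b  = rbinom(n, 1, 0.5),                              # binary predictor
  z  = rnorm(n)                                        # continuous predictor
)
d$y <- 1 + 2 * d$z + rnorm(n)

# Design matrix with an intercept column and dummy-coded factors.
x <- model.matrix(y ~ f1 + f2 + b + z, data = d)

# The "assign" attribute records which model term each column came from:
# 0 for the intercept, then 1, 2, ... for the terms in formula order.
index <- attr(x, "assign")
index[index == 0] <- NA    # NA marks the intercept

index
```

Here the two dummies of each factor share a group number, and the binary and continuous predictors form groups of size one. The x and index objects have the shape expected by the lambdamax() and grplasso() calls quoted above, with LinReg() substituted for LogReg().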

This can be plotted against lambda, but when I do lasso regression 
in other software I end up with a plot of the coefficients against the 
tuning parameter with a cutpoint or a table and graph that tells me 
what to include in the model based on some selected criterion.  
It's not clear from the example if there's a cross-validation or some 
other procedure to determine what variables to include.  plot(fit) 
produces a graph of coefficients against lambda but nothing to indicate 
what to include.  What is used in the package, if anything, to make that 
determination?


Re: [R] multidimensional array calculation

2012-01-13 Thread Jean V Adams

See
?expand.grid

For example,
df <- expand.grid(L=L, AR=AR, SO=SO, T=T)
df$y <- fun(df$L, df$AR, df$SO, df$T)
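
A complete, runnable version of the same idea, using the vectors from the original question. Two details are additions for illustration: the vector is called Tm rather than T so that the shorthand for TRUE is not masked, and the last step reshapes the result into the 4-d array the question asked about.

```r
L  <- c(200, 400, 600)
AR <- 1.5
SO <- c(1, 3, 5)
Tm <- c(30, 365)   # named Tm here so the TRUE shorthand T is not masked

fun <- function(L, AR, SO, T) exp(L * AR + sqrt(SO) * log(T))

# One row per combination; the first factor varies fastest.
df <- expand.grid(L = L, AR = AR, SO = SO, T = Tm)
df$y <- with(df, fun(L, AR, SO, T))

# The same values as a 4-d array: the row order of expand.grid() matches
# R's column-major array layout, so the vector can be reshaped directly.
arr <- array(df$y,
             dim = c(length(L), length(AR), length(SO), length(Tm)),
             dimnames = lapply(list(L = L, AR = AR, SO = SO, T = Tm),
                               as.character))

nrow(df)
```

With these inputs exp() overflows to Inf for L = 600, because the exponent exceeds roughly 709, the largest value a double can handle; working with the exponent itself avoids that.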

Jean




Re: [R] multidimensional array calculation

2012-01-13 Thread R. Michael Weylandt

Perhaps repeated use of the outer() function. You could also write a
multi.outer() or adopt one of the solutions here:
http://stackoverflow.com/questions/6192848/how-to-generalize-outer-to-n-dimensions

Michael

On Fri, Jan 13, 2012 at 1:28 PM, Johannes Radinger jradin...@gmx.at wrote:
 Hello,

 probably it is quite easy but I can get it: I have
 mulitple numeric vectors and a function using
 all of them to calculate a new value:

 L - c(200,400,600)
 AR - c(1.5)
 SO - c(1,3,5)
 T - c(30,365)

 fun - function(L,AR,SO,T){
        exp(L*AR+sqrt(SO)*log(T))
 }

 How can I get an array or dataframe where
 all possible combinations of the factors are listed
 and the new value is calculated.

 I thought about an array like:
 array(NA, dim = c(3,1,3,2), 
 dimnames=list(c(200,400,600),c(1.5),c(1,3,5),c(30,365)))

 but how can I get the array populated according to the function?

 As I want to end up with a 2D data frame, I will probably use the
 melt.array() function from the reshape package. Or is there a simpler
 way to get such a full-factorial data frame with all possible combinations?

 Best regards,
 Johannes

Re: [R] plotting regression line in with lattice

2012-01-13 Thread Weidong Gu

Hi,

Since trt is a factor, you are using it for indexing. Try just deleting this
line from the code:
fill <- my.fill[combined$trt[subscripts]]

Weidong Gu
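
Another possible route, sketched here under assumptions and not taken from the thread, is to skip make.groups() and draw both layers inside one panel function: the points get a fill chosen by trt, and the model's fitted values are added with panel.lines(). The column name pred and the object name fit are illustrative; trt is made a factor explicitly so that it can index my.fill.

```r
library(lattice)

set.seed(4)
mydata <- data.frame(var  = rnorm(100),
                     temp = rnorm(100),
                     subj = factor(rep(1:10, 5)),
                     trt  = factor(rep(c("A", "B"), 50)))

fit <- lm(var ~ temp * subj, data = mydata)
mydata$pred <- fitted(fit)          # model predictions at the observed temps

my.fill <- c("black", "grey")       # one fill colour per level of trt

p <- xyplot(var ~ temp | subj, data = mydata,
            panel = function(x, y, subscripts, ...) {
              # points: fill chosen by the treatment of each plotted row
              panel.xyplot(x, y, pch = 21, col = "black",
                           fill = my.fill[as.integer(mydata$trt[subscripts])])
              # line: fitted values for the same rows, ordered by temp
              o <- order(x)
              panel.lines(x[o], mydata$pred[subscripts][o], col = "black")
            })
```

Calling print(p) draws the plot. Because the model is linear in temp within each subject, joining the fitted values in order of temp gives the per-panel regression line.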

On Fri, Jan 13, 2012 at 11:30 AM, matteo dossena m.doss...@qmul.ac.uk wrote:
 #Dear All,
 #I'm having a bit of trouble here, please help me...
 #I have this data
 set.seed(4)
 mydata <- data.frame(var = rnorm(100),
                     temp = rnorm(100),
                     subj = as.factor(rep(c(1:10),5)),
                     trt = rep(c("A","B"), 50))

 #and this model that fits them
 lm  <- lm(var ~ temp * subj, data = mydata)

 #I want to plot the results with lattice and draw the regression line
 #predicted by my model through them.
 #To do so, I'm using the approach outlined in "Lattice Tricks for the power
 #useR" by D. Sarkar

 temp_rng <- range(mydata$temp, finite = TRUE)

 grid <- expand.grid(temp = do.breaks(temp_rng, 30),
                    subj = unique(mydata$subj),
                    trt = unique(mydata$trt))

 model <- cbind(grid, var = predict(lm, newdata = grid))

 orig <- mydata[c("var","temp","subj","trt")]

 combined <- make.groups(original = orig, model = model)


 xyplot(var ~ temp | subj,
       data = combined,
       groups = which,
       type = c("p", "l"),
       distribute.type = TRUE
       )


 # So far everything is fine, but I also want to assign a fill to the data
 # points for the two treatments, trt=1 and trt=2.
 # So I have written this piece of code, which works fine, but when it comes to
 # plotting the regression line, it seems that type is not recognized by the
 # panel function...

 my.fill <- c("black", "grey")

 plot <- with(combined,
        xyplot(var ~ temp | subj,
              data = combined,
              group = combined$which,
              type = c("p", "l"),
              distribute.type = TRUE,
              panel = function(x, y, ..., subscripts){
                         fill <- my.fill[combined$trt[subscripts]]
                         panel.xyplot(x, y, pch = 21, fill = my.fill, col = "black")
                         },
             key = list(space = "right",
                     text = list(c("trt1", "trt2"), cex = 0.8),
                     points = list(pch = c(21), fill = c("black", "grey")),
                     rep = FALSE)
                     )
      )
 plot

 #I've also tried moving type and distribute.type into panel.xyplot, as
 #well as subsetting the data inside panel.xyplot, like this

 plot <- with(combined,
        xyplot(var ~ temp | subj,
              data = combined,
              panel = function(x, y, ..., subscripts){
                         fill <- my.fill[combined$trt[subscripts]]
                         panel.xyplot(x[combined$which=="original"],
                                      y[combined$which=="original"],
                                      pch = 21, fill = my.fill, col = "black")
                         panel.xyplot(x[combined$which=="model"],
                                      y[combined$which=="model"],
                                      type = "l", col = "black")
                         },
             key = list(space = "right",
                     text = list(c("trt1", "trt2"), cex = 0.8),
                     points = list(pch = c(21), fill = c("black", "grey")),
                     rep = FALSE)
                     )
      )
 plot

 #but no success with that either...
 #Can anyone help me get the predicted values plotted as a line instead of
 #as points?
 #I'd really appreciate it
 #matteo






Re: [R] Problem Installing R to SuSE 10 via RPM

2012-01-13 Thread Ken Hutchison

apt-get won't run on SUSE unless you have aptitude, and your distribution won't
compile a deb without some intervention. You won't be able to query the package,
regardless of key, if your local proxy settings (set by the admin) won't let you.
Perhaps see whether there is a tool to convert the deb package to an rpm locally,
so you can install it after downloading the deb. That said, building from source
shouldn't be that challenging, unless you have ridiculous Fortran compiler
problems like I do with Linux.
 Hope that helps,
   Ken 


On Jan 13, 2012, at 1:13 PM, Marc Schwartz marc_schwa...@me.com wrote:

 Two things:
 
 1. Looking at the sizes of the RPMs at the URL provided, the 'R-base-devel' 
 RPM is only 84k. So as with other Linux distros, that is likely to contain R 
 header files and other such things to enable compilation of CRAN packages 
 during their installation using the source tarballs. The 'R-base' RPM is 
 33Mb, so that is your primary install for R. You will want both.
 
 2. I don't know about the configuration of the RPMs on SUSE, but presume that 
 as with RH/Fedora, the RPMs can be created in such a fashion that they are 
 not relocatable. That means that they are configured to be only installed 
 'system-wide', not to a specific user's directory tree. You may have issues 
 trying to install them without root access on your computer, thus may get 
 other permission related errors when you get the correct RPM. Whether they 
 are or are not relocatable would be a question for the SUSE maintainer for 
 the R package. If they are, there are 'rpm' command line arguments that are 
 relevant. So use 'man rpm' for more details. If the RPMs are not relocatable, 
 you will be left with the option of building R from source in order to 
 install R locally and will want to read the R Installation and Administration 
 manual for details.
 
 HTH,
 
 Marc Schwartz
 
 On Jan 13, 2012, at 11:00 AM, Matthew Pettis wrote:
 
 Thanks, will do!  I thought devel included base, but evidently, that's not
 the case...
 
 On Fri, Jan 13, 2012 at 10:58 AM, Gavin Blackburn 
 gavin.blackb...@strath.ac.uk wrote:
 
 I think it's saying you need to install R-base before R-base-devel.
 
 You'll need to add a cran repository as SUSE might not have the most
 up-to-date version of R.
 
 This is the code for Ubuntu I assume it's the same, just change the distro
 and keyserver:
 
 sudo apt-get update
 sudo add-apt-repository 'deb
 http://cran.ma.imperial.ac.uk/bin/linux/ubuntu oneiric/'
 gpg --keyserver keyserver.ubuntu.com --recv-key E084DAB9
 gpg -a --export E084DAB9 | sudo apt-key add -
 
 
 
 You may also want to get Sun java, again, change distro and keyserver:
 
 sudo add-apt-repository ppa:ferramroberto/java
 sudo apt-key adv --recv-key --keyserver keyserver.ubuntu.com B725097B3ACC3965
 sudo apt-get update
 sudo apt-get install sun-java6-jdk sun-java6-plugin
 
 Then run:
 
 
 sudo apt-get install r-base r-base-dev
 sudo R CMD javareconf
 
 Cheers,
 
 Gavin.
 
 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
 On Behalf Of Matthew Pettis
 Sent: 13 January 2012 16:43
 To: r-help@r-project.org
 Subject: [R] Problem Installing R to SuSE 10 via RPM
 
 Hi,
 
 I'm trying to install R from an rpm locally to my account (the reason I'm
 not doing it through yast/yast2/zypper is that the sys admin isn't yet
 willing to install it, and doesn't want to support it, but will help me
 support it if I install it locally -- in short, policy problems rather than
 technical).  Below is the SuSE version, Kernel version, and rpm install
 error I'm getting, as well as the error...
 
 Can anyone help me with the error?  I'm trying to install R-base 2.14.1,
 but it is telling me that I need R-base version 2.14.1 as a dependency.  Am
 I using the wrong rpm for an installation starting from scratch?
 
 I got the rpm from:
 
 http://download.opensuse.org/repositories/devel:/languages:/R:/base/SLE_10/x86_64/
 
 Thanks,
 Matt
 
 
 
 
 pettis@swat:~/bin> cat /etc/*-release
 SUSE Linux Enterprise Server 10 (x86_64)
 VERSION = 10
 PATCHLEVEL = 2
 
 pettis@swat:~/bin> uname -a
 Linux swat 2.6.16.60-0.34-smp #1 SMP Fri Jan 16 14:59:01 UTC 2009 x86_64
 x86_64 x86_64 GNU/Linux
 
 pettis@swat:~/bin> rpm -ivh R-base-devel-2.14.1-30.1.x86_64.rpm
 warning: R-base-devel-2.14.1-30.1.x86_64.rpm: Header V3 DSA signature:
 NOKEY, key ID 793371fe
 error: Failed dependencies:
  R-base = 2.14.1 is needed by R-base-devel-2.14.1-30.1.x86_64
 

Re: [R] Coloring counties on a full US map based on a certain criterion

2012-01-13 Thread Dimitri Liakhovitski


 But the help does say that additional arguments are passed to lines(),
 so you can use lty=0.

 That can leave white bits between counties if the areas don't line up
 precisely, so I think it looks better with the lines in black.

I agree, indeed it leaves white bits. Of course, I could try to change
the type of the lines using lty...
Sorry for one more question: can one change the color of these lines
(borders between counties) from black?
Thank you!
D.



 Sarah

 Thank you!
 Dimitri

 ### My criterion for all counties:
 allcounties <- data.frame(county=map('county', plot=FALSE)$names)
 allcounties$group <- c(rep(1:6,513),rep(1,4))[order(c(rep(1:6,513),rep(1,4)))]
 ### My colors:
 classcolors <- rainbow(6)
 map('county',fill=TRUE,col=classcolors[allcounties$group])
 map('state', lwd=2, add=TRUE)



 Sure. Just start with
 map('county')
 instead.
 I like to add something like:
 map('state', lwd=3, add=TRUE)

 I am trying:
 ### My criterion for all counties in the US:
 allcounties <- data.frame(county=map('county', plot=FALSE)$names)
 allcounties$group <- sample(1:5,3082,replace=TRUE)
 ### My colors:
 classcolors <- rainbow(5)
 ### Trying to build the map - not working:
 map(database='usa',regions='county',fill=TRUE,col=classcolors[allcounties$group])



 You'll need to instead coordinate with the names of the entire US:
 length(map('county', plot=FALSE)$names)
 [1] 3082

 Sarah

 --
 Sarah Goslee
 http://www.functionaldiversity.org



-- 
Dimitri Liakhovitski
marketfusionanalytics.com


Re: [R] apply transformation

2012-01-13 Thread Jean V Adams

Try this:

d_tmp2 <- data.frame(apply(round(100*d_tmp[, -1], 2), 2, paste, "%",
                           sep=""))
names(d_tmp2) <- paste(names(d_tmp[, -1]), "Lbl", sep="-")
d_final <- cbind(d_tmp, d_tmp2)
d_final[, 1+c(0, order(names(d_final[, -1])))]

Jean
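
A variant of the same idea, offered only as a sketch: sprintf() can format each value as a percentage with two decimals, and lapply() over the year columns keeps it independent of how many years there are. The data frame is rebuilt from the dput() output quoted below; the helper names yrs and lbl are illustrative.

```r
# Data as in the original post (column names kept as "2006", "2007").
d_tmp <- data.frame(Year = c("Jan", "Feb", "Mar"),
                    `2006` = c(0.0204, 0.0145, 0.0027),
                    `2007` = c(0.0065, 0.0082, 0.0122),
                    check.names = FALSE, stringsAsFactors = FALSE)

# Every column except Year is treated as a year column.
yrs <- setdiff(names(d_tmp), "Year")

# One character vector of labels per year; "%%" prints a literal percent sign.
lbl <- lapply(d_tmp[yrs], function(x) sprintf("%.2f%%", 100 * x))
names(lbl) <- paste(yrs, "Lbl", sep = "-")

# Add the label columns, then interleave them with the numeric columns.
d_final <- d_tmp
d_final[names(lbl)] <- lbl
d_final <- d_final[c("Year", sort(c(yrs, names(lbl))))]
```

The numeric columns stay numeric here because only the new label columns are character; apply() on a data frame goes through a matrix, which is where the coercion described in the question tends to come from.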


santosh wrote on 01/13/2012 04:55:51 AM:

 Hello All,
 
 I have the following dataset:
 
 Year   2006   2007
 Jan  Jan 0.0204 0.0065
 Feb  Feb 0.0145 0.0082
 Mar  Mar 0.0027 0.0122
 
 
  dput(d_tmp)
 structure(list(Year = c("Jan", "Feb", "Mar"), `2006` = c(0.0204,
 0.0145, 0.0027), `2007` = c(0.0065, 0.0082, 0.0122)), .Names =
 c("Year",
 "2006", "2007"), row.names = c("Jan", "Feb", "Mar"), class =
 "data.frame")
 
 
 I am trying to use the apply function but the values seem to be
 getting coerced to characters. I could recast in my function ... but I
 suspect there should be an easier way.
 
 I can always use a for loop to get the output I need but just
 wondering if there a way to get the same using apply or some other
 function ... (the number of years can be changing in my requirement)
 
 My final output needs to be as follows:
 
 Year   2006   2006-Lbl   2007   2007-Lbl
 Jan   0.0204   '2.04%'   0.0065   '0.65%'
 Feb   0.0145   '1.45%'   0.0082   '0.82%'
 Mar   0.0027   '0.27%'   0.0122   '1.22%'
 
 i.e.
  dput(d_final)
 structure(list(Year = structure(c(2L, 1L, 3L), .Label = c("Feb",
 "Jan", "Mar"), class = "factor"), X2006 = c(0.0204, 0.0145, 0.0027
 ), X2006.Lbl = structure(c(3L, 2L, 1L), .Label = c("'0.27%'",
 "'1.45%'", "'2.04%'"), class = "factor"), X2007 = c(0.0065, 0.0082,
 0.0122), X2007.Lbl = structure(1:3, .Label = c("'0.65%'", "'0.82%'",
 "'1.22%'"), class = "factor")), .Names = c("Year", "X2006",
 "X2006.Lbl",
 "X2007", "X2007.Lbl"), row.names = c(NA, -3L), class = "data.frame")
 
 Please advise.
 
 Santosh


Re: [R] Coloring counties on a full US map based on a certain criterion

2012-01-13 Thread Sarah Goslee

Changing the color of the borders is what the border argument to
polygon() would do, if only it hadn't been overridden.

So no, you can't easily change the line color. It would be an easy
tweak to the code to add a polyborder argument that is passed
to polygon() as border, though. That would solve both your
problems.

Sarah

On Fri, Jan 13, 2012 at 3:13 PM, Dimitri Liakhovitski
dimitri.liakhovit...@gmail.com wrote:

 But the help does say that additional arguments are passed to lines(),
 so you can use lty=0.

 That can leave white bits between counties if the areas don't line up
 precisely, so I think it looks better with the lines in black.

 I agree, indeed it leaves white bits. Of course, I could try to change
 the type of the lines using lty...
 Sorry for one more question: can one change the color of these lines
 (borders between counties) from black?
 Thank you!
 D.



 Sarah

 Thank you!
 Dimitri

 ### My criterion for all counties:
 allcounties <- data.frame(county=map('county', plot=FALSE)$names)
 allcounties$group <- c(rep(1:6,513),rep(1,4))[order(c(rep(1:6,513),rep(1,4)))]
 ### My colors:
 classcolors <- rainbow(6)
 map('county',fill=TRUE,col=classcolors[allcounties$group])
 map('state', lwd=2, add=TRUE)



 Sure. Just start with
 map('county')
 instead.
 I like to add something like:
 map('state', lwd=3, add=TRUE)

 I am trying:
 ### My criterion for all counties in the US:
 allcounties <- data.frame(county=map('county', plot=FALSE)$names)
 allcounties$group <- sample(1:5,3082,replace=TRUE)
 ### My colors:
 classcolors <- rainbow(5)
 ### Trying to build the map - not working:
 map(database='usa',regions='county',fill=TRUE,col=classcolors[allcounties$group])



 You'll need to instead coordinate with the names of the entire US:
 length(map('county', plot=FALSE)$names)
 [1] 3082

 Sarah



Re: [R] Coloring counties on a full US map based on a certain criterion

2012-01-13 Thread David L Carlson

You can set the fg graphics parameter, for example

oldpar <- par(fg='white') # change the default fg (foreground color) to white
map('county', 'iowa', fill=TRUE, col='light gray')
par(oldpar) # reset fg to the default - black
map('state', 'iowa', lwd=3, add=TRUE)

Assuming you want the state outline in black with the county boundaries in
white. Otherwise just eliminate the last map command.

--
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352


-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Dimitri Liakhovitski
Sent: Friday, January 13, 2012 2:14 PM
To: Sarah Goslee
Cc: r-help
Subject: Re: [R] Coloring counties on a full US map based on a certain
criterion


 But the help does say that additional arguments are passed to lines(),
 so you can use lty=0.

 That can leave white bits between counties if the areas don't line up
 precisely, so I think it looks better with the lines in black.

I agree, indeed it leaves white bits. Of course, I could try to change
the type of the lines using lty...
Sorry for one more question: can one change the color of these lines
(borders between counties) from black?
Thank you!
D.



 Sarah

 Thank you!
 Dimitri

 ### My criterion for all counties:
 allcounties <- data.frame(county=map('county', plot=FALSE)$names)
 allcounties$group <- c(rep(1:6,513),rep(1,4))[order(c(rep(1:6,513),rep(1,4)))]
 ### My colors:
 classcolors <- rainbow(6)
 map('county',fill=TRUE,col=classcolors[allcounties$group])
 map('state', lwd=2, add=TRUE)



 Sure. Just start with
 map('county')
 instead.
 I like to add something like:
 map('state', lwd=3, add=TRUE)

 I am trying:
 ### My criterion for all counties in the US:
 allcounties <- data.frame(county=map('county', plot=FALSE)$names)
 allcounties$group <- sample(1:5,3082,replace=TRUE)
 ### My colors:
 classcolors <- rainbow(5)
 ### Trying to build the map - not working:
 map(database='usa',regions='county',fill=TRUE,col=classcolors[allcounties$group])



 You'll need to instead coordinate with the names of the entire US:
 length(map('county', plot=FALSE)$names)
 [1] 3082

 Sarah

 --
 Sarah Goslee
 http://www.functionaldiversity.org



-- 
Dimitri Liakhovitski
marketfusionanalytics.com


Re: [R] Coloring counties on a full US map based on a certain criterion

2012-01-13 Thread Dimitri Liakhovitski

Sarah, David, thank you very much for your help! It all works now:

### My criterion for all counties:
allcounties <- data.frame(county=map('county', plot=FALSE)$names)
allcounties$group <- c(rep(1:6,513),rep(1,4))[order(c(rep(1:6,513),rep(1,4)))]
### My colors:
classcolors <- rainbow(6)
### For gray border color:
oldpar <- par(fg='gray') # change the default fg (foreground color) to gray
### My US map:
map('county',fill=TRUE,col=classcolors[allcounties$group],lty=3,
    bg = "transparent")
### For state borders:
map('state', lwd=1, add=TRUE)

Dimitri


On Fri, Jan 13, 2012 at 3:24 PM, David L Carlson dcarl...@tamu.edu wrote:
 You can set the fg graphics parameter, for example

 oldpar <- par(fg='white') # change the default fg (foreground color) to white
 map('county', 'iowa', fill=TRUE, col='light gray')
 par(oldpar) # reset fg to the default - black
 map('state', 'iowa', lwd=3, add=TRUE)

 Assuming you want the state outline in black with the county boundaries in
 white. Otherwise just eliminate the last map command.

 --
 David L Carlson
 Associate Professor of Anthropology
 Texas A&M University
 College Station, TX 77843-4352


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
 Behalf Of Dimitri Liakhovitski
 Sent: Friday, January 13, 2012 2:14 PM
 To: Sarah Goslee
 Cc: r-help
 Subject: Re: [R] Coloring counties on a full US map based on a certain
 criterion


 But the help does say that additional arguments are passed to lines(),
 so you can use lty=0.

 That can leave white bits between counties if the areas don't line up
 precisely, so I think it looks better with the lines in black.

 I agree, indeed it leaves white bits. Of course, I could try to change
 the type of the lines using lty...
 Sorry for one more question: can one change the color of these lines
 (borders between counties) from black?
 Thank you!
 D.



 Sarah

 Thank you!
 Dimitri

 ### My criterion for all counties:
 allcounties <- data.frame(county=map('county', plot=FALSE)$names)
 allcounties$group <- c(rep(1:6,513),rep(1,4))[order(c(rep(1:6,513),rep(1,4)))]
 ### My colors:
 classcolors <- rainbow(6)
 map('county',fill=TRUE,col=classcolors[allcounties$group])
 map('state', lwd=2, add=TRUE)



 Sure. Just start with
 map('county')
 instead.
 I like to add something like:
 map('state', lwd=3, add=TRUE)

 I am trying:
 ### My criterion for all counties in the US:
 allcounties <- data.frame(county=map('county', plot=FALSE)$names)
 allcounties$group <- sample(1:5,3082,replace=TRUE)
 ### My colors:
 classcolors <- rainbow(5)
 ### Trying to build the map - not working:
 map(database='usa',regions='county',fill=TRUE,col=classcolors[allcounties$group])



 You'll need to instead coordinate with the names of the entire US:
 length(map('county', plot=FALSE)$names)
 [1] 3082

 Sarah

 --
 Sarah Goslee
 http://www.functionaldiversity.org



 --
 Dimitri Liakhovitski
 marketfusionanalytics.com





-- 
Dimitri Liakhovitski
marketfusionanalytics.com


Re: [R] Coloring counties on a full US map based on a certain criterion

2012-01-13 Thread Ray Brownrigg


On 14/01/2012 8:04 a.m., Sarah Goslee wrote:

On Fri, Jan 13, 2012 at 1:52 PM, Dimitri Liakhovitski
dimitri.liakhovit...@gmail.com  wrote:

Just to clarify, according to help about the fill argument:
logical flag that says whether to draw lines or fill areas. If FALSE,
the lines bounding each region will be drawn (but only once, for
interior lines). If TRUE, each region will be filled using colors from
the col = argument, and bounding lines will not be drawn.
We have fill=TRUE - so why are the county borders still drawn?
Thank you!
Dimitri


This prompted me to check the code:

if fill=TRUE, map() calls polygon()
if fill=FALSE, map() calls lines()

But polygon() draws borders by default.

plot(c(0,1), c(0,1), type="n")
polygon(c(0,0,1,1), c(0,1,1,0), col="yellow")

To not draw borders, the border argument is provided:

plot(c(0,1), c(0,1), type="n")
polygon(c(0,0,1,1), c(0,1,1,0), col="yellow", border=NA)

But that fails in map():

map('county', 'iowa', fill=TRUE, col=rainbow(20), border=NA)

Error in par(pin = p) :
   invalid value specified for graphical parameter pin

because border is used as a named argument in map() already, for setting the
size of the plot area, so there's no way to alter the border argument
to polygon.

Coincidentally, I became aware of this just recently.  When the maps 
package was created (way back in the 'new' S era), polygon() didn't 
add borders, and that is why ?map states that fill does not add 
borders.  A workaround is to change the map() option border= to 
myborder= (it is then used twice in map()).

The work-around I suggested previously (lty=0) seems to be the only
way to deal with the problem.

In fact I believe there is another workaround if you don't want to 
modify the code; use the option resolution=0 in the map() call. I.e. try 
(in Sarah's original Iowa example):


map('county', 'iowa', fill= TRUE, col = classcolors[countycol], 
resolution=0, lty=0)


This ensures that the polygon boundaries match up.

I'll fix the border issue in the next version of maps (*not* the one 
just uploaded to CRAN, which was to add Cibola County to NM).
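
The polygon() behaviour discussed in this thread can be checked with base graphics alone, without the maps package. The sketch below draws on a null PDF device so that nothing is written to disk; the variable names (sq_x, sq_y, fg_before, fg_after) are illustrative. It shows the three cases: default border in par("fg"), no border with border = NA, and a recoloured border via par(fg = ...).

```r
pdf(NULL)                      # null device: nothing is written to disk
fg_before <- par("fg")

plot(c(0, 1), c(0, 1), type = "n")
sq_x <- c(0, 0, 1, 1)
sq_y <- c(0, 1, 1, 0)

polygon(sq_x, sq_y, col = "yellow")               # border drawn in par("fg")
polygon(sq_x, sq_y, col = "yellow", border = NA)  # no border at all

oldpar <- par(fg = "gray")     # par() returns the previous setting
polygon(sq_x, sq_y, col = "yellow")               # border now drawn in gray
par(oldpar)                    # restore the previous foreground colour

fg_after <- par("fg")
invisible(dev.off())
```

Saving the value returned by par() and passing it back afterwards is what keeps later plots, such as the state outlines, in the original foreground colour.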


Ray Brownrigg

Sarah



Re: [R] Coloring counties on a full US map based on a certain criterion

2012-01-13 Thread Sarah Goslee

Hi Ray,

I'm glad to see you here. I was going to write this up a bit more
clearly and email it to you, but now I don't have to bother. :)

 Coincidentally, I became aware of this just recently.  When the maps package
 was created (way back in the 'new' S era), polygon() didn't add borders,
 and that is why ?map states that fill does not add borders.  A workaround is
 to change the map() option border= to myborder= (it is then used twice in
 map()).

I thought it was probably a legacy code issue.

 In fact I believe there is another workaround if you don't want to modify
 the code; use the option resolution=0 in the map() call. I.e. try (in
 Sarah's original Iowa example):

 map('county', 'iowa', fill= TRUE, col = classcolors[countycol],
 resolution=0, lty=0)

 This ensures that the polygon boundaries match up.

Ah! That works nicely, and wasn't clear to me from the help that it
would do so.

Thanks!

-- 
Sarah Goslee
http://www.functionaldiversity.org


Re: [R] Portfolio Optimization

2012-01-13 Thread Enrico Schumann



I would be biased towards using a heuristic, for instance Threshold 
Accepting (TA), for solving such a problem. (TA is implemented in 
package NMOF. Disclosure: I am the author of that package.) But you will 
not find a ready-to-use solution there.


(1) You need an objective function, i.e., a function that maps a given 
vector of holdings (and data like your scenario matrix) into a real 
number; the better the portfolio, the lower the number.


(2) For TA, you need a so-called neighbourhood function. That is a 
function that changes one portfolio vector into another by changing 
some elements. Examples of simple neighbourhoods are in the package 
vignettes. Do you have a budget constraint? If yes, and you want to work 
with integers, I would suggest using a cash variable. (See, e.g., 
Algorithm 3 in http://www.swissfinanceinstitute.ch/rp20.pdf )


(3) The constraints can, at least in a first round, be included through 
penalties.
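
The three ingredients above can be sketched in plain base R, without the NMOF package. Everything specific here is an assumption made for illustration: the simulated scenario matrix ER, the choice of mean payout as the quantity to maximise, long-only integer holdings capped at max_units, and the threshold sequence. It is a toy Threshold Accepting loop, not a tuned implementation.

```r
set.seed(42)

# Hypothetical data: 50 scenarios (rows) x 5 strategies (columns), in dollars.
ER <- matrix(rnorm(50 * 5, mean = 10, sd = 100), nrow = 50)

min_return <- -2000   # assumed floor on the worst-case portfolio payout
min_stocks <- 3       # assumed minimum number of strategies held
max_units  <- 20      # assumed cap on contracts per strategy

# (1) Objective: lower is better. Maximise the mean payout and
# (3) penalise violations of the two constraints.
objective <- function(w, ER, penalty = 1e4) {
  payout    <- as.vector(ER %*% w)
  shortfall <- max(0, min_return - min(payout))
  too_few   <- max(0, min_stocks - sum(w != 0))
  -mean(payout) + penalty * (shortfall + too_few)
}

# (2) Neighbourhood: change one randomly chosen holding by one contract,
# keeping it an integer between 0 and max_units.
neighbour <- function(w) {
  i <- sample.int(length(w), 1)
  w[i] <- min(max_units, max(0, w[i] + sample(c(-1, 1), 1)))
  w
}

# Threshold Accepting: accept a neighbour unless it is worse than the
# current solution by more than the current threshold.
w <- rep(1, ncol(ER))
f <- objective(w, ER)
f_start <- f
best <- w
f_best <- f
for (tau in seq(50, 0, length.out = 10)) {
  for (step in 1:500) {
    w_new <- neighbour(w)
    f_new <- objective(w_new, ER)
    if (f_new - f <= tau) {
      w <- w_new
      f <- f_new
      if (f < f_best) {
        best <- w
        f_best <- f
      }
    }
  }
}
```

After the loop, best holds the best integer weight vector found and f_best its penalised objective value. Because the weights only ever move in steps of one contract, the integer constraint is satisfied by construction and does not need a penalty.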



Regards,
Enrico

PS. There is a mailing list dedicated to finance-with-R questions, and 
you may get better answers there.


https://stat.ethz.ch/mailman/listinfo/r-sig-finance



--
Enrico Schumann
Lucerne, Switzerland
http://nmof.net/


Am 13.01.2012 17:06, schrieb Sal Pellettieri:

Hi,

I'm an R newbie and I've been struggling with an optimization problem for
the past couple of days now.

Here's the problem - I have a matrix of expected payouts from different
stock option strategies. Each column in my matrix represents a different
stock and each row represents the return to the strategy given a certain
market move. So the rows are not a time series of percentage returns but a
dollar payout in different expected scenarios, i.e.

Expected Return Matrix (ER) =
            stock1   stock2   ...   stockn
scenario1   $        $              $
scenario2   $        $              $
scenario3   $        $              $
...

I want to create an optimal portfolio of these strategies by applying a
vector of weights. The weights will be the number of contracts of each to
buy and won't be a percentage weighting. There are a few constraints I need
it to comply with:

- The weights have to be integers
- The minimum portfolio return (ER * Weights) across the scenarios has to
be greater than some negative number I specify
- There has to be a certain minimum number of stocks in the portfolio, so
length(weights) > some number I specify.

Any help is GREATLY appreciated since I have tried so many different
functions and packages. Even if someone can just lead me to the correct
function to use that would be a great help as I've looked at optim,
solveLP, ROI package and many others.


Thanks,
S




Re: [R] Averaging within a range of values

2012-01-13 Thread doggysaywhat

Hello Jeff, thank you for the reply.  I tried the cut function and I have two
questions.  How do I have the cut function take the first position in the
start column in df1 as the first cut point and the first position in column
2 as the second cut point?  The breaks argument seems to want a single
vector.  I tried combining both vectors into one, where I had, say,

200
700
500
1000
etc

then cut gives me the 200-500 range, 500-700, and 700-1000.  In this case I
wanted the range, 200-700, and 500-1000.  

Is there a way to define the first point of each cut as positions along the
START vector and all second points of the cut as positions along the END
vector?

I also had one additional question.  When playing around with this, I
noticed that I had to do this for the Pos column in the second data frame. 
But, when I get the ranges, how do I have it return the values in C0 or C1
in df2 that are in the same rows as those of the ranges?

Thanks again for the help.
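
Because the ranges overlap (200-700 and 500-1000), a single cut() call cannot represent them; one possible approach is to test each START/END pair separately with mapply(). The sketch below is only an illustration: the toy contents of df1 and df2, the column names START and END, and the helper mean_in_range are assumptions, since the original data frames are not shown in this message.

```r
# Hypothetical ranges and positions, shaped like the data described above.
df1 <- data.frame(START = c(200, 500), END = c(700, 1000))
df2 <- data.frame(Pos = c(100, 250, 600, 650, 900, 1200),
                  C0  = c(1, 2, 3, 4, 5, 6),
                  C1  = c(10, 20, 30, 40, 50, 60))

# Mean of one df2 column over the rows whose Pos lies in [start, end].
mean_in_range <- function(start, end, column) {
  inside <- df2$Pos >= start & df2$Pos <= end
  mean(df2[[column]][inside])
}

# One call per row of df1: the i-th START is paired with the i-th END.
df1$mean_C0 <- mapply(mean_in_range, df1$START, df1$END,
                      MoreArgs = list(column = "C0"))
df1$mean_C1 <- mapply(mean_in_range, df1$START, df1$END,
                      MoreArgs = list(column = "C1"))
```

The logical vector inside selects whole rows of df2, so the C0 and C1 values returned are the ones sitting in the same rows as the matching Pos values.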
 

--
View this message in context: 
http://r.789695.n4.nabble.com/Averaging-within-a-range-of-values-tp4291958p4293410.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Coloring counties on a full US map based on a certain criterion

2012-01-13 Thread Dimitri Liakhovitski

Everyone, thanks a lot - this is great:

### My criterion for all counties:
allcounties <- data.frame(county=map('county', plot=FALSE)$names)
allcounties$group <- c(rep(1:6,513),rep(1,4))[order(c(rep(1:6,513),rep(1,4)))]
### My colors:
classcolors <- rainbow(6)

### 1. If I want to have no borders between counties:
map('county',fill=TRUE,col=classcolors[allcounties$group],resolution=0,lty=0,
    bg = "transparent")
map('state', lwd=1, add=TRUE)

### 2. If I want to see borders between counties (of a desired
### color, e.g., gray):
### For line color:
oldpar <- par(fg='gray') # change the default fg (foreground color) to gray
### My US map:
map('county',fill=TRUE,col=classcolors[allcounties$group],lty=1,
    bg = "transparent")
par(oldpar)

Dimitri

On Fri, Jan 13, 2012 at 3:48 PM, Sarah Goslee sarah.gos...@gmail.com wrote:
 Hi Ray,

 I'm glad to see you here. I was going to write this up a bit more
 clearly and email it to you, but now I don't have to bother. :)

 Coincidentally, I became aware of this just recently.  When the maps package
 was created (way back in the 'new' S era), polygon() didn't add borders,
 and that is why ?map states that fill does not add borders.  A workaround is
 to change the map() option border= to myborder= (it is then used twice in
 map()).

 I thought it was probably a legacy code issue.

 In fact I believe there is another workaround if you don't want to modify
 the code; use the option resolution=0 in the map() call. I.e. try (in
 Sarah's original Iowa example):

 map('county', 'iowa', fill= TRUE, col = classcolors[countycol],
 resolution=0, lty=0)

 This ensures that the polygon boundaries match up.

 Ah! That works nicely, and wasn't clear to me from the help that it
 would do so.

 Thanks!

 --
 Sarah Goslee
 http://www.functionaldiversity.org




-- 
Dimitri Liakhovitski
marketfusionanalytics.com


Re: [R] Coloring counties on a full US map based on a certain criterion

2012-01-13 Thread Dimitri Liakhovitski

Somewhat related question out of curiosity:
Does anyone know how often the list of the counties and county names
is updated in this package? Or is it done centrally for all packages
that deal with US counties?
Thanks!
Dimitri


On Fri, Jan 13, 2012 at 3:41 PM, Ray Brownrigg
ray.brownr...@ecs.vuw.ac.nz wrote:
 On 14/01/2012 8:04 a.m., Sarah Goslee wrote:

 On Fri, Jan 13, 2012 at 1:52 PM, Dimitri Liakhovitski
 dimitri.liakhovit...@gmail.com  wrote:

 Just to clarify, according to help about the fill argument:
 logical flag that says whether to draw lines or fill areas. If FALSE,
 the lines bounding each region will be drawn (but only once, for
 interior lines). If TRUE, each region will be filled using colors from
 the col = argument, and bounding lines will not be drawn.
 We have fill=TRUE - so why are the county borders still drawn?
 Thank you!
 Dimitri

 This prompted me to check the code:

 if fill=TRUE, map() calls polygon()
 if fill=FALSE, map() calls lines()

 But polygon() draws borders by default.

 plot(c(0,1), c(0,1), type="n")
 polygon(c(0,0,1,1), c(0,1,1,0), col="yellow")

 To not draw borders, the border argument is provided:

 plot(c(0,1), c(0,1), type="n")
 polygon(c(0,0,1,1), c(0,1,1,0), col="yellow", border=NA)

 But that fails in map():

 map('county', 'iowa', fill=TRUE, col=rainbow(20), border=NA)

 Error in par(pin = p) :
   invalid value specified for graphical parameter pin

 because border is used as a named argument in map() already, for setting
 the size of the plot area, so there's no way to alter the border argument
 to polygon.

 Coincidentally, I became aware of this just recently.  When the maps package
 was created (way back in the 'new' S era), polygon() didn't add borders,
 and that is why ?map states that fill does not add borders.  A workaround is
 to change the map() option border= to myborder= (it is then used twice in
 map()).

 The work-around I suggested previously (lty=0) seems to be the only
 way to deal with the problem.

 In fact I believe there is another workaround if you don't want to modify
 the code; use the option resolution=0 in the map() call. I.e. try (in
 Sarah's original Iowa example):

 map('county', 'iowa', fill= TRUE, col = classcolors[countycol],
 resolution=0, lty=0)

 This ensures that the polygon boundaries match up.

 I'll fix the border issue in the next version of maps (*not* the one just
 uploaded to CRAN, which was to add Cibola County to NM).

 Ray Brownrigg

 Sarah





-- 
Dimitri Liakhovitski
marketfusionanalytics.com


Re: [R] Problem Installing R to SuSE 10 via RPM

2012-01-13 Thread peter dalgaard


On Jan 13, 2012, at 17:43 , Matthew Pettis wrote:

 Hi,
 
 I'm trying to install R from an rpm locally to my account (the reason I'm
 not doing it through yast/yast2/zypper is that the sys admin isn't yet
 willing to install it, and doesn't want to support it, but will help me
 support it if I install it locally -- in short, policy problems rather than
 technical).  Below is the SuSE version, Kernel version, and rpm install
 error I'm getting, as well as the error...

As others have said, installing from RPM outside the system folders could be a
no-go. Another option is to install from source; this works fine if you have
all the relevant packages installed (compilers, libraries, header files...). If
you don't already have them, get your sysadmin to install them (say, one per day
over a couple of weeks, until he sees the error of his ways...).

But really, the painless option for both of you is just to install it from the 
official SUSE sources with yast, keep it updated automagically, and keep all 
add-on packages in your own home directory. The sysadmin workload for R on SUSE 
(and most other Linuxen) should be essentially nil. 
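
As a rough sketch of the "add-on packages in your own home directory" part: the helper below (use_private_library is a made-up name, and the directory in the usage comment is only an assumed example) creates a private library directory and puts it first on R's library search path, so that later install.packages() calls can target it without root access.

```r
# Hypothetical helper: create a per-user package library and make it the
# first entry of the library search path for the current session.
use_private_library <- function(lib_dir) {
  dir.create(lib_dir, recursive = TRUE, showWarnings = FALSE)
  .libPaths(c(lib_dir, .libPaths()))
  invisible(.libPaths()[1])
}

# Example usage (the path is an assumption; any writable directory works):
# use_private_library("~/R/library")
# install.packages("maps", lib = .libPaths()[1])
```

To make this permanent, the same directory can be named in the R_LIBS_USER environment variable, for example in ~/.Renviron.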

 
 Can anyone help me with the error?  I'm trying to install R-base 2.14.1,
 but it is telling me that I need R-base version 2.14.1 as a dependency.  Am
 I using the wrong rpm for an installation starting from scratch?
 
 I got the rpm from:
 http://download.opensuse.org/repositories/devel:/languages:/R:/base/SLE_10/x86_64/
 
 Thanks,
 Matt
 
 
 
 
 pettis@swat:~/bin> cat /etc/*-release
 SUSE Linux Enterprise Server 10 (x86_64)
 VERSION = 10
 PATCHLEVEL = 2
 
 pettis@swat:~/bin> uname -a
 Linux swat 2.6.16.60-0.34-smp #1 SMP Fri Jan 16 14:59:01 UTC 2009 x86_64
 x86_64 x86_64 GNU/Linux
 
 pettis@swat:~/bin> rpm -ivh R-base-devel-2.14.1-30.1.x86_64.rpm
 warning: R-base-devel-2.14.1-30.1.x86_64.rpm: Header V3 DSA signature:
 NOKEY, key ID 793371fe
 error: Failed dependencies:
     R-base = 2.14.1 is needed by R-base-devel-2.14.1-30.1.x86_64
 

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com


[R] deviance and variance - GAM models

2012-01-13 Thread collifu

Hi all,

This is pretty basic, but I am not an expert and I couldn't find anything in
the forum or my statistics book about it. I was reading a paper in which the
authors used explained deviance and explained variance as synonyms when
describing a GAM regression. Is that right? I ran an analysis in R to look
at the output of a GAM regression, and I think that:

- 'R-sq. (adj)' is the percentage of variance explained by the regression,
i.e., I can write "The regression explains xx% of variance."
- 'Deviance explained' is a simple measure of the quality of the fit, but it
is not related to the percentage of variance that is explained by the
regression.

Am I right? 

Thank you so much
Ramón
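
One way to probe the question is to compare the two summary figures on simulated data. The sketch below assumes the mgcv package and a Gaussian GAM with identity link; the data, seed, and variable names are invented for illustration. In that special case the deviance is the residual sum of squares, so 'Deviance explained' should match the unadjusted R-squared, while 'R-sq.(adj)' is its adjusted counterpart. For other families (binomial, Poisson, ...) the two quantities generally differ.

```r
library(mgcv)  # recommended package, normally shipped with R

set.seed(1)    # hypothetical simulated data, for illustration only
dat <- data.frame(x = runif(200))
dat$y <- sin(2 * pi * dat$x) + rnorm(200, sd = 0.3)

fit <- gam(y ~ s(x), data = dat)   # Gaussian family, identity link by default
fit_summary <- summary(fit)

dev_expl <- fit_summary$dev.expl   # printed as "Deviance explained"
r2_adj   <- fit_summary$r.sq       # printed as "R-sq.(adj)"

# Unadjusted R-squared computed by hand from residual and total sums of squares
r2_raw <- 1 - sum(residuals(fit)^2) / sum((dat$y - mean(dat$y))^2)

c(deviance_explained = dev_expl, r2_unadjusted = r2_raw, r2_adjusted = r2_adj)
```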

--
View this message in context: 
http://r.789695.n4.nabble.com/deviance-and-variance-GAM-models-tp4293293p4293293.html
Sent from the R help mailing list archive at Nabble.com.


[R] outputs from command by

2012-01-13 Thread Hai Lin

Hello R experts,

I have generated a data set below. I tried to export the object my.pvalues
(class "by") as a .txt (or Excel) file, so that the output, in matrix form,
lines up nicely in the .xls file. Do I have to convert it to some other data
format before using write.table? I am having difficulty getting this out.


I would really appreciate it if anyone here could kindly guide me.

Kevin

# Data   

a <- c(rnorm(6,mean=0.5,sd=0.1),rnorm(18,mean=4,sd=1))
b <- factor(rep(LETTERS[1:4], each=3, length.out=24))
grp <- rep(factor(c("WT","TG")), each=12)

d <- as.data.frame(cbind(a,b,grp))  # cbind() turns the factors into integer codes

my.pvalues <- by(d,
  INDICES=d[,"grp"],
  FUN=function(x)
    pairwise.t.test(x$a,x$b)$p.value
  )
class(my.pvalues)
[1] "by"


###  Results  ###

> my.pvalues
d[, "grp"]: 1
  1  2  3
2 1 NA NA
3 1  1 NA
4 1  1  1

d[, "grp"]: 2
             1            2          3
2 0.7629610481           NA         NA
3 0.0006906049 0.0006059707         NA
4 0.0164179404 0.0141134949 0.03267941
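
One possible way to get such a "by" object into a flat file is to stack the per-group matrices into a single data frame first and then call write.table() once. The sketch below is only an illustration: it rebuilds the data with data.frame() so the factor labels survive, uses a fixed seed, and writes to a temporary file; the names stacked and out_file are made up.

```r
set.seed(1)  # assumption: fixed seed so the sketch is reproducible
a   <- c(rnorm(6, mean = 0.5, sd = 0.1), rnorm(18, mean = 4, sd = 1))
b   <- factor(rep(LETTERS[1:4], each = 3, length.out = 24))
grp <- rep(factor(c("WT", "TG")), each = 12)
d   <- data.frame(a, b, grp)

my.pvalues <- by(d, d$grp, function(x) pairwise.t.test(x$a, x$b)$p.value)

# Turn each group's p-value matrix into a data frame that carries the group
# label and the row level as ordinary columns, then stack the pieces.
groups <- dimnames(my.pvalues)[[1]]
stacked <- do.call(rbind, lapply(seq_along(groups), function(i) {
  m <- my.pvalues[[i]]
  out <- data.frame(grp = groups[i], level = rownames(m), m,
                    check.names = FALSE)
  rownames(out) <- NULL
  out
}))

# Tab-separated text opens cleanly in spreadsheet programs.
out_file <- file.path(tempdir(), "pvalues.txt")  # hypothetical output path
write.table(stacked, out_file, sep = "\t", quote = FALSE, row.names = FALSE)
```

Each row of the file then identifies its group and comparison level, so nothing depends on the printed layout of the "by" object.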


[R] Help with t student test

2012-01-13 Thread elisacarli21

Dear all,
I've the following data.frame

               GENDER MEASURE01 MEASURE02 MEASURE03 MEASURE04 MEASURE05 MEASURE06
R. Rafuse      MALE           9         1         6         8         1         2
T. Leiker      MALE           6         7         1         8         0         0
E. Bizot       FEMALE         9         8         2         9         8         8
K. French      MALE           7         9         0         5         9         9
E. Van Landuyt MALE           7         1         6         2         8         9
K. Harrell     FEMALE         6         0         0         8         3         1
W. Noren       FEMALE         7         4         3         2         5         7
W. Willden     MALE           9         9         2         6         6         8
S. Kohut       FEMALE         7         8         2         3         6         9

I'd like to run an independent t test with MEASURE01-MEASURE06 as the
dependent variables and GENDER as the grouping variable.
Namely, I wish to compare the means of all variables between the male and
female groups. Could you help me find the right R command?
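
A minimal sketch of one way to do this: loop over the measure columns and call t.test() with GENDER as the grouping factor. The data below are randomly generated stand-ins with the same column names (not the values from the post), and the object names scores, measures, and results are made up. t.test() defaults to the Welch test; var.equal = TRUE would give the classical Student version.

```r
set.seed(42)  # hypothetical data with the same layout as the posted table
scores <- data.frame(
  GENDER = factor(rep(c("MALE", "FEMALE"), times = c(5, 4))),
  matrix(sample(0:9, 9 * 6, replace = TRUE), nrow = 9,
         dimnames = list(NULL, sprintf("MEASURE%02d", 1:6)))
)

measures <- grep("^MEASURE", names(scores), value = TRUE)

# One two-sample t test per measure; collect the key numbers in a matrix.
results <- sapply(measures, function(v) {
  tt <- t.test(scores[[v]] ~ scores$GENDER)
  c(mean_female = unname(tt$estimate[1]),
    mean_male   = unname(tt$estimate[2]),
    t           = unname(tt$statistic),
    df          = unname(tt$parameter),
    p.value     = tt$p.value)
})

t(results)  # one row per measure
```

With six tests run side by side, adjusting the p-values (for example with p.adjust()) may be worth considering.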

Best



--
View this message in context: 
http://r.789695.n4.nabble.com/Help-with-t-student-test-tp4293203p4293203.html
Sent from the R help mailing list archive at Nabble.com.


[R] The Future of R | API to Public Databases

2012-01-13 Thread Benjamin Weber

Dear R Users -

R is a wonderful software package. CRAN provides a variety of tools to
work on your data. But R is not apt to utilize all the public
databases in an efficient manner.
I observed the most tedious part with R is searching and downloading
the data from public databases and putting it into the right format. I
could not find a package on CRAN which offers exactly this fundamental
capability.
Imagine R is the unified interface to access (and analyze) all public
data in the easiest way possible. That would create a real impact,
would put R a big leap forward and would enable us to see the world
with different eyes.

There is a lack of a direct connection to the API of these databases,
to name a few:

- Eurostat
- OECD
- IMF
- Worldbank
- UN
- FAO
- data.gov
- ...

The ease of access to the data is the key of information processing with R.

How can we handle the flow of information noise? R has to give an
answer to that with an extensive API to public databases.

I would love your comments and ideas as a contribution in a vital discussion.

Benjamin


Re: [R] Averaging over data sets

2012-01-13 Thread MacQueen, Don

Here is a solution that works for your small example.
It might be difficult to prepare your larger data sets to use the same
method.

db <- rbind(d1,d2)
aggregate(subset(db,select=-c(subject,trt)),
by=list(subject=db$subject),mean)
## or, for example,
aggregate(subset(db,select=-c(subject,trt)), by=list(subject=db$subject,
trt=db$trt),mean)

In order for aggregate() to work with mean, its first argument must have only
numeric columns. That is what
subset(db,select=-c(subject,trt)) does for you.

(d1 + d2)/2 did not work because d1 and d2 are data frames that contain
non-numeric columns (subject and trt), for which arithmetic is not defined.
Much more complicated, you could have done your averages one at a time,
  (d1$eat1[d1$subject=='Felipe'] + d2$eat1[d2$subject=='Felipe'])/2
and similarly for eat3 and John. But that is of course not practical for
larger data sets.
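
For reference, here is the same approach as a self-contained sketch, using the two small data frames from the original question (the result name avg is made up). It stacks the imputed data sets, then averages the numeric columns within each subject/trt combination.

```r
d1 <- data.frame(subject = c("Felipe", "John"), eat1 = 1:2, eat3 = 5:6,
                 trt = c("t1", "t2"))
d2 <- data.frame(subject = c("Felipe", "John"), eat1 = 3:4, eat3 = 6:7,
                 trt = c("t1", "t2"))

# Stack the imputed data sets on top of each other.
db <- rbind(d1, d2)

# Average the numeric columns; subject and trt act as grouping keys, so
# they are carried along unchanged instead of being averaged.
avg <- aggregate(subset(db, select = -c(subject, trt)),
                 by = list(subject = db$subject, trt = db$trt),
                 FUN = mean)
avg
```

With ten imputed data sets the only change would be stacking all ten in the rbind() step, provided the grouping columns are identical across imputations.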

-Don



-- 
Don MacQueen

Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062





On 1/12/12 10:16 PM, Felipe Nunes felipnu...@gmail.com wrote:

Hi all,

after using Amelia II to create 10 imputed data sets I need to average them
to have one unique data set that includes the average for each cell of the
imputed variables, in addition to the values for the variables not imputed.
Such data has many variables (some numeric, others factors), and more than
2 observations. I do not know how to average them out. Any help?

Below I provide a small example:

Suppose Amelia provided two datasets:

d1 <- data.frame(subject = c("Felipe", "John"), eat1 = 1:2, eat3 = 5:6,
                 trt = c("t1", "t2"))

d2 <- data.frame(subject = c("Felipe", "John"), eat1 = 3:4, eat3 = 6:7,
                 trt = c("t1", "t2"))

I tried

(d1 + d2)/2

but I lose my factors. mean() did not work either.

The result I'd like is:

  subject eat1 eat3 trt
1  Felipe    2  5.5  t1
2    John    3  6.5  t2

thanks,

*Felipe Nunes*
CAPES/Fulbright Fellow
PhD Student Political Science - UCLA
Web: felipenunes.bol.ucla.edu


Re: [R] The Future of R | API to Public Databases

2012-01-13 Thread Sarah Goslee

R is Open Source. You're welcome to write tools, and submit your
package to CRAN. I think some part of this has been done, based
on questions to the list asking about those parts.

Personally, I've been using S-Plus and then R for 18 years, and never
required data from any of them. Which doesn't make it not important,
but suggests that public databases aren't the be-all and end-all for
R use.

Sarah

On Fri, Jan 13, 2012 at 4:14 PM, Benjamin Weber m...@bwe.im wrote:
 Dear R Users -

 R is a wonderful software package. CRAN provides a variety of tools to
 work on your data. But R is not apt to utilize all the public
 databases in an efficient manner.
 I observed the most tedious part with R is searching and downloading
 the data from public databases and putting it into the right format. I
 could not find a package on CRAN which offers exactly this fundamental
 capability.
 Imagine R is the unified interface to access (and analyze) all public
 data in the easiest way possible. That would create a real impact,
 would put R a big leap forward and would enable us to see the world
 with different eyes.

 There is a lack of a direct connection to the API of these databases,
 to name a few:

 - Eurostat
 - OECD
 - IMF
 - Worldbank
 - UN
 - FAO
 - data.gov
 - ...

 The ease of access to the data is the key of information processing with R.

 How can we handle the flow of information noise? R has to give an
 answer to that with an extensive API to public databases.

 I would love your comments and ideas as a contribution in a vital discussion.

 Benjamin


-- 
Sarah Goslee
http://www.functionaldiversity.org


[R] Regression Modeling Strategies 3-Day Short Course March 2012

2012-01-13 Thread Frank Harrell

*RMS Short Course 2012* 
Frank E. Harrell, Jr., Ph.D., Professor and Chair 
Department of Biostatistics, Vanderbilt University School of Medicine 

*March 7, 8 & 9, 2012*
8:00am - 4:30pm 
Vanderbilt University   Nashville Tennessee   USA

http://biostat.mc.vanderbilt.edu/RMSShortCourse

This course covers a variety of regression modeling and model validation
methods as well as the R rms package.

Please email interest to eve.a.ander...@vanderbilt.edu


-
Frank Harrell
Department of Biostatistics, Vanderbilt University
--
View this message in context: 
http://r.789695.n4.nabble.com/Regression-Modeling-Strategies-3-Day-Short-Course-March-2012-tp4293555p4293555.html
Sent from the R help mailing list archive at Nabble.com.
