date:20130109

[R] [R-pkgs] Version 1.3.0 of apcluster package on CRAN

2013-01-09 Thread Ulrich Bodenhofer


Dear colleagues,

This is to inform you that Version 1.3.0 of the R package apcluster was
released on CRAN yesterday. We did a major extension and overhaul
of the package. Most importantly, we added Leveraged Affinity
Propagation in response to multiple users' requests. It should now be
much easier to cluster large data sets. Apart from this extension, the
interfaces to apcluster() and related functions have been made more
convenient and flexible. For more details, see the following URLs:


http://www.bioinf.jku.at/software/apcluster/
http://cran.r-project.org/web/packages/apcluster/index.html


Best regards,
Ulrich



*Dr. Ulrich Bodenhofer*
Associate Professor
Institute of Bioinformatics

*Johannes Kepler University*
Altenberger Str. 69
4040 Linz, Austria

Tel. +43 732 2468 9552
Fax +43 732 2468 9511
bodenho...@bioinf.jku.at
http://www.bioinf.jku.at/

___
R-packages mailing list
r-packa...@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Problem getting loess tricubic weights

2013-01-09 Thread Joyce Lin

Thank you Mr Gunter!  I will look into it.


On Wed, Jan 9, 2013 at 11:59 AM, Bert Gunter gunter.ber...@gene.com wrote:

 As this does not seem to have been answered...

 I believe you may misunderstand how loess works. The tricube weights
 are part of the smoothing algorithm and change with each local fit;
 they are not fixed per-observation weights. Fixed weights are what the
 weights argument provides (and, IIRC, these multiply the tricube weights).

 I suggest you consult

 ?predict.loess

 to get standard deviations of fitted values at existing or new points.

 -- Bert
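
A minimal sketch of the distinction described above, using the cars data from the question. The helper tricube_weights() is hypothetical and not part of loess; it assumes the neighbourhood holds floor(span * n) points and ignores the extra robustness weights that family="symmetric" applies, so treat it as an approximation of what happens inside one local fit.

```r
# Fit from the question, with the assignment arrow and quotes restored
cars.lo <- loess(dist ~ speed, data = cars, span = 0.5, degree = 1,
                 family = "symmetric")

# The weights component holds the prior (user-supplied) weights,
# which are all 1 because no weights argument was given
unique(cars.lo$weights)

# Hypothetical helper: tricube kernel weights for ONE local fit centred at x0
tricube_weights <- function(x, x0, span) {
  d <- abs(x - x0)
  q <- floor(span * length(x))   # assumed neighbourhood size
  h <- sort(d)[q]                # distance to the q-th nearest point
  u <- pmin(d / h, 1)            # points outside the neighbourhood get weight 0
  (1 - u^3)^3
}
w15 <- tricube_weights(cars$speed, x0 = 15, span = 0.5)

# Standard errors of fitted values, as suggested via ?predict.loess
pr <- predict(cars.lo, newdata = data.frame(speed = c(10, 15, 20)), se = TRUE)
pr$se.fit
```

Calling tricube_weights() with a different x0 gives a different weight vector, which is why a single per-observation set of tricube weights is not stored in the fitted object.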



 On Tue, Jan 8, 2013 at 12:57 AM, Joyce Lin joyceli...@gmail.com wrote:
  Hi

  I am trying to get the tricube weights from the loess output, as I need
  to calculate an error function which requires the weights.

  So I have used the following example from the R help:

  cars.lo <- loess(dist ~ speed, cars, span=0.5, degree=1,
                   family="symmetric")

  Then I try to get the weights:

  cars.lo$weights
   [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  [37] 1 1 1 1 1 1 1 1 1 1 1 1 1 1

  The results are all 1, so I don't think that the tricube weights are set.
  May I know what other parameters I need to tweak to set the weights to
  tricube weights? Thank you.
 
 
  --
  Best regards
  Joyce Lin
 



 --

 Bert Gunter
 Genentech Nonclinical Biostatistics

 Internal Contact Info:
 Phone: 467-7374
 Website:

 http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm




-- 
Best regards
Joyce Lin


Re: [R] plot residuals per factor

2013-01-09 Thread arun

Hi,
I forgot to mention:
levels(dat1$d)
#[1] "1" "2" "3" "4" "5"

Suppose I use different levels:
library(car)
dat1$d1<-recode(dat1$d,"1='A';2='B';3='C';4='D';5='E'")
levels(dat1$d1) # check the order of the levels
#[1] "A" "B" "C" "D" "E"
mypath<-file.path("/home/arun/Trial1",paste("catalin_",LETTERS[1:5],".jpg",sep=""))
#change the file.path according to your system

for(i in seq_along(mypath)){
  jpeg(file=mypath[i])
  par(mfrow=c(2,2))
  line<-lm(y~x,data=dat1[as.numeric(dat1$d1)==i,])
  plot(line,which=1:4) # if you want only residual vs. fitted, change which=1
  #abline(0,0)
  dev.off()
}

In case you need to change the order of levels:
dat1$d1<-factor(dat1$d1,levels=c("C","D","E","A","B"))
levels(dat1$d1)
#[1] "C" "D" "E" "A" "B"

mypath<-file.path("/home/arun/Trial1",paste("catalin_",LETTERS[c(3,4,5,1,2)],".jpg",sep=""))
for(i in seq_along(mypath)){
  jpeg(file=mypath[i])
  par(mfrow=c(2,2))
  line<-lm(y~x,data=dat1[as.numeric(dat1$d1)==i,])
  plot(line,which=1:4)
  #abline(0,0)
  dev.off()
}
A.K.
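
For the ggplot2 route asked about in the original question, one possible sketch is to collect fitted values and residuals per group into a single data frame first, since ggplot() expects a data frame rather than a list of models. The data below are simulated stand-ins for dat1 (the real data were not posted), and the commented ggplot2 call is only a suggestion that is not run here.

```r
# Simulated stand-in for dat1: a grouping factor d, predictor x, response y
set.seed(1)
dat1 <- data.frame(d = factor(rep(1:5, each = 20)), x = runif(100))
dat1$y <- 2 * dat1$x + as.numeric(dat1$d) + rnorm(100, sd = 0.3)

# Fit one lm per level of d and stack fitted values and residuals
res <- do.call(rbind, lapply(split(dat1, dat1$d), function(df) {
  mod <- lm(y ~ x, data = df)
  data.frame(d = df$d, .fitted = fitted(mod), .resid = resid(mod))
}))
head(res)

# With ggplot2 installed, the stacked data frame can then be plotted, e.g.:
# library(ggplot2)
# ggplot(res, aes(.fitted, .resid, colour = d)) +
#   geom_hline(yintercept = 0) + geom_point() + geom_smooth(se = FALSE)
```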







- Original Message -
From: catalin roibu catalinro...@gmail.com
To: r-help@r-project.org
Cc: 
Sent: Tuesday, January 8, 2013 4:22 AM
Subject: [R] plot residuals per factor

Dear R-users,
I want to plot residuals vs. fitted values for multiple groups with ggplot2.
I tried this code, but without success.
library(plyr)
models<-dlply(dat1,"d",function(df)
mod<-lm(y~x,data=df))

  ggplot(models,aes(.fitted,.resid), color=factor(d))+
  geom_hline(yintercept=0,col="white",size=2)+
  geom_point()+
  geom_smooth(se=F)

-- 
---
Catalin-Constantin ROIBU
Forestry engineer, PhD
Forestry Faculty of Suceava
Str. Universitatii no. 13, Suceava, 720229, Romania
office phone     +4 0230 52 29 78, ext. 531
mobile phone   +4 0745 53 18 01
                       +4 0766 71 76 58
FAX:                +4 0230 52 16 64
silvic.usv.ro


[R] How to use 'glmnet' or 'lars' package to select features?

2013-01-09 Thread Zhong Wang

Hi all,

I am new to statistics. I want to do lasso feature selection on a
bioinformatics data set. I know I can use the 'glmnet' or 'lars' package to do
that. However, the glmnet() and lars() functions return a model object. I
don't know how to use this object for feature selection. What should I
do next?

Thanks,
Zhong


[R] t-test behavior given that the null hypothesis is true

2013-01-09 Thread Pavlos Pavlidis

Dear all,
I observe a strange behavior of the p-values of the t-test under the null
hypothesis. Specifically, I draw 2 samples of 3 individuals each from a
normal distribution with mean 0 and variance 1. Then, I calculate the p-value
using the t-test (var.equal=TRUE, samples are independent). When I make a
histogram of the p-values, I see that the bin of the smallest
p-values consistently has a lower frequency. Is this a known behavior of the
t-test, or is it some kind of bug or random number generation problem?

kind regards,
idaios


Re: [R] Applying a user-defined function

2013-01-09 Thread arun

Hi Pradip,
I didn't check the mode at that time. It generated a matrix:
test1$newcols <- sapply()
You can do this:
test2<-data.frame(test1[,-7],test1$newcols)

str(test2)
#'data.frame':    51 obs. of  9 variables:
# $ ObtMj_P   : num  49.6 55 52.5 50.5 51.1 55.1 56.3 53.6 53.5 52.7 ...
# $ ObtMj_SE  : num  1.37 1.41 1.56 1.22 0.65 1.26 1.28 1.3 1.22 0.67 ...
# $ ExpPrevMed_P  : num  80 81.8 79.6 78 80.5 81.7 85 79.5 76.2 78.9 ...
# $ ExpPrevMed_SE : num  0.91 1.08 1.2 0.78 0.53 1.03 0.93 1.04 1.03 0.52 ...
# $ ParMon_P  : num  12.1 12.4 15.8 12.8 13 12.1 14.6 14.7 14.3 14.1 ...
# $ ParMon_SE : num  0.68 0.9 1.08 0.72 0.41 0.72 0.77 0.97 1.13 0.45 ...
# $ ObtMj_P.1 : Factor w/ 5 levels "[42,48.7]","(48.7,50.9]",..: 2 5 3 2 3 5 5 4 4 3 ...
# $ ExpPrevMed_P.1: Factor w/ 5 levels "[76.2,79.2]",..: 2 3 2 1 2 3 5 2 1 1 ...
# $ ParMon_P.1    : Factor w/ 5 levels "[11.9,12.6]",..: 1 1 5 2 2 1 4 4 3 3 ...


levels(test2[,7])
#[1] "[42,48.7]"   "(48.7,50.9]" "(50.9,52.8]" "(52.8,54.2]" "(54.2,58.7]"
Do you want to replace this with 1:5?
levels(test2[,8])
#[1] "[76.2,79.2]" "(79.2,80.5]" "(80.5,81.9]" "(81.9,83.5]" "(83.5,85]"
as.numeric(test2[,7])
# [1] 2 5 3 2 3 5 5 4 4 3 3 2 1 3 2 1 1 4 2 5 4 5 3 3 1 1 5 1 4 5 4 5 4 3 1 2 1 4
#[39] 4 5 2 1 2 1 1 5 3 4 3 2 2


A.K.
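
One way to get the quintile columns directly as factors with levels 1 to 5 is sketched below. It is an illustration on a small simulated data frame (toy and its column names are made up), not a drop-in replacement for the test1 code: it passes labels = 1:5 to cut() and uses lapply() so that each result stays a factor instead of being collapsed into a matrix by sapply().

```r
# Toy stand-in for three numeric columns of test1 (names are illustrative)
set.seed(42)
toy <- data.frame(a_p = rnorm(50), b_p = runif(50), c_p = rexp(50))

# Quintile cut that returns a factor with levels "1".."5"
CutQuintiles <- function(x) {
  cut(x, quantile(x, 0:5 / 5), include.lowest = TRUE, labels = 1:5)
}

# lapply() keeps every result a factor; bind them on as new columns
newcols <- lapply(toy, CutQuintiles)
names(newcols) <- paste0(names(toy), "_cat")
toy <- cbind(toy, as.data.frame(newcols))
str(toy)
```

Using paste0() avoids the space that paste(..., "_cat") would insert before the suffix.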



- Original Message -
From: Muhuri, Pradip (SAMHSA/CBHSQ) pradip.muh...@samhsa.hhs.gov
To: R help r-help@r-project.org
Cc: 
Sent: Tuesday, January 8, 2013 10:06 PM
Subject: Re: [R] Applying a user-defined function


Hello List,

Last time, Arun's following solution worked to create 3 new columns (1,3,5).  
Now how would I tweak this function to create corresponding (additional) 
columns (7,8,9) of mode factor (levels = 1,2,3,4,5)?

Thanks for your continued support.

Pradip

### cut and paste from the reproducible example
CutQuintiles <- function( x) {
  cut (x,quantile (x, (0:5/5)),include.lowest=TRUE)
}

#apply the CutQuintile () on every odd-numbered columns of the test1 data frame
test1$newcols <- sapply(test1 [, seq (1,6,2)], CutQuintiles)

# name 3 new columns based on the odd-numbered columns
names(test1$newcols) <- paste (names(test1 [, seq (1,6,2)]), "_cat")




## Reproducible Example


test1 <- read.table (text="
State,ObtMj_P,ObtMj_SE,ExpPrevMed_P,ExpPrevMed_SE,ParMon_P,ParMon_SE
Alabama,49.60,1.37,80.00,0.91,12.10,0.68
Alaska,55.00,1.41,81.80,1.08,12.40,0.90
Arizona,52.50,1.56,79.60,1.20,15.80,1.08
Arkansas,50.50,1.22,78.00,0.78,12.80,0.72
California,51.10,0.65,80.50,0.53,13.00,0.41
Colorado,55.10,1.26,81.70,1.03,12.10,0.72
Connecticut,56.30,1.28,85.00,0.93,14.60,0.77
Delaware,53.60,1.30,79.50,1.04,14.70,0.97
District of Columbia,53.50,1.22,76.20,1.03,14.30,1.13
Florida,52.70,0.67,78.90,0.52,14.10,0.45
Georgia,52.50,1.15,79.30,1.02,15.90,0.98
Hawaii,49.40,1.33,83.80,1.12,16.00,1.06
Idaho,48.30,1.23,82.40,0.99,11.90,0.74
Illinois,52.70,0.63,81.00,0.46,13.60,0.40
Indiana,49.60,1.16,80.90,0.91,12.60,0.82
Iowa,46.30,1.37,82.10,1.01,13.60,0.87
Kansas,44.30,1.43,79.20,0.98,12.90,0.79
Kentucky,52.90,1.37,78.70,1.05,14.60,0.98
Louisiana,49.70,1.23,76.80,1.06,14.50,0.76
Maine,55.60,1.44,82.90,0.93,16.70,0.83
Maryland,53.90,1.46,83.60,0.95,14.00,0.80
Massachusetts,55.40,1.41,81.00,1.15,14.70,0.80
Michigan,52.40,0.62,80.50,0.47,15.00,0.43
Minnesota,51.50,1.20,84.40,0.87,14.40,0.86
Mississippi,43.20,1.14,76.60,0.91,12.30,0.78
Missouri,48.70,1.20,80.30,0.90,13.70,0.12
Montana,56.40,1.16,83.70,0.95,12.10,0.68
Nebraska,45.70,1.51,83.40,0.95,12.40,0.90
Nevada,54.20,1.17,80.60,1.07,15.80,1.08
New Hampshire,56.10,1.30,83.30,0.93,12.80,0.72
New Jersey,53.20,1.45,83.70,0.95,13.00,0.41
New Mexico,57.60,1.34,78.90,1.03,12.10,0.72
New York,53.70,0.67,82.60,0.48,14.60,0.77
North Carolina,52.20,1.26,81.90,0.84,14.70,0.97
North Dakota,48.60,1.34,84.20,0.88,14.30,1.13
Ohio,50.90,0.61,82.70,0.49,14.10,0.45
Oklahoma,47.20,1.42,78.80,1.33,15.90,0.98
Oregon,54.00,1.35,80.60,1.14,16.00,1.06
Pennsylvania,53.00,0.63,79.90,0.47,11.90,0.74
Rhode Island,57.20,1.20,79.50,1.02,13.60,0.40
South Carolina,50.50,1.21,79.50,0.95,12.60,0.82
South Dakota,43.40,1.30,81.70,1.05,13.60,0.87
Tennessee,48.90,1.35,78.40,1.35,12.90,0.79
Texas,48.70,0.62,79.00,0.48,14.60,0.98
Utah,42.00,1.49,85.00,0.93,14.50,0.76
Vermont,58.70,1.24,83.70,0.84,16.70,0.83
Virginia,51.80,1.18,82.00,1.04,14.00,0.80
Washington,53.50,1.39,84.10,0.96,14.70,0.80
West Virginia,52.80,1.07,79.80,0.93,15.00,0.43
Wisconsin,49.90,1.50,83.50,1.02,14.40,0.86
Wyoming,49.20,1.29,82.00,0.85,12.30,0.78
", sep=",", row.names='State',  header=TRUE, as.is=TRUE)

# change names () to lower case

names (test1) <- tolower (names (test1))

#Write a cut/quantile function to apply on different columns of the data frame

CutQuintiles <- function( x) {
  cut (x,quantile (x, (0:5/5)),include.lowest=TRUE)
}

#apply the CutQuintile () on every odd-numbered columns of the test1 data frame
test1$newcols <- sapply(test1 [, seq (1,6,2)], CutQuintiles)

# name 3 new columns based on the odd-numbered columns
names(test1$newcols) <- paste

Re: [R] plot x-axis DateTime NOT evenly spaced

2013-01-09 Thread ishi soichi

Thanks! it was really helpful.

soichi


2013/1/7 arun smartpink...@yahoo.com

 Hi,
 Try this:
 dat1<-read.table(text="1 2012-07-01 00:57:54 +0900 156
 2 2012-07-01 01:07:41 +0900 587
 3 2012-07-01 01:09:31 +0900 110
 4 2012-07-01 01:18:42 +0900 551
 5 2012-07-01 01:39:01 +0900 1219
 6 2012-07-01 01:40:40 +0900 99",sep="",header=FALSE,stringsAsFactors=FALSE)

 dat2<-data.frame(date=paste(dat1[,2],dat1[,3],paste0("+",dat1[,4]),sep=" "),Interval=dat1[,5])
 dat2$date<-as.POSIXct(dat2$date,format="%Y-%m-%d %H:%M:%S")
 library(xts)
 dat3<-xts(dat2[,-1],order.by=dat2[,1])
 plot(dat3)
 A.K.
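
 A base-R variant of the same idea, as a sketch only: as.Date() drops the time of day, whereas as.POSIXct() keeps it, and the %z conversion reads the "+0900" offset on input. The time zone name "Asia/Tokyo" is an assumption made here for display; the six rows are the ones shown in the question.

```r
interval <- data.frame(
  date = c("2012-07-01 00:57:54 +0900", "2012-07-01 01:07:41 +0900",
           "2012-07-01 01:09:31 +0900", "2012-07-01 01:18:42 +0900",
           "2012-07-01 01:39:01 +0900", "2012-07-01 01:40:40 +0900"),
  interval = c(156, 587, 110, 551, 1219, 99),
  stringsAsFactors = FALSE
)

# Parse date, time and UTC offset; the result keeps hours, minutes and seconds
interval$date <- as.POSIXct(interval$date, format = "%Y-%m-%d %H:%M:%S %z",
                            tz = "Asia/Tokyo")

# Unevenly spaced events are placed at their true positions on the time axis
plot(interval$date, interval$interval, type = "h",
     xlab = "Time", ylab = "Interval")
```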




 - Original Message -
 From: ishi soichi soichi...@gmail.com
 To: r-help r-help@r-project.org
 Cc:
 Sent: Monday, January 7, 2013 3:55 AM
 Subject: [R] plot x-axis DateTime NOT evenly spaced

 R-64 latest

 Hi. I am trying to plot a set of csv data, which looks like

  head(interval)
                        date inteval
 1 2012-07-01 00:57:54 +0900     156
 2 2012-07-01 01:07:41 +0900     587
 3 2012-07-01 01:09:31 +0900     110
 4 2012-07-01 01:18:42 +0900     551
 5 2012-07-01 01:39:01 +0900    1219
 6 2012-07-01 01:40:40 +0900      99

 as you can see, more than one event happens each day, and they are not
 evenly spaced.  Obviously hours, minutes and seconds are important for the
 plot.

 I tried

 interval$date <- as.Date(interval$date, "%Y-%m-%d %H:%M:%S +0900")

 but this chops the time off.

 Could anyone show me how to plot data with x values as Date(or Time)
 objects?

 soichi


Re: [R] how to label two figures in the same chunk independently with knitr

2013-01-09 Thread Francesco Sarracino

Dear Yihui,

thanks a lot for your kind reply. Your solution is very elegant and
versatile.
However, there is a point that is unclear to me, and I didn't manage to
fully understand it after looking at the knitr manual and the graphics manual.
The issue concerns the hook:

knit_hooks$set(par = function(before, options, envir) {
  if (before) par(mar = c(4, 4, .1, .1))
})

Why do you set par as a function?
Moreover, below you write:

par(bg=rgb(runif(1), runif(1), runif(1)))

Does this mean that before = rgb(runif(1)), options = runif(1),
and envir = runif(1)?

And what does this produce? I don't understand what's going on; can you
please help me or point me to some documentation?

thanks in advance for your kind help,
f.
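
A rough illustration of the hook mechanism asked about above, written in plain R. This is not knitr's implementation: hooks and run_chunk() are made-up stand-ins. The knitr documentation describes a chunk hook as a function with arguments before, options and envir that is run before and after a chunk whenever the chunk option of the same name is set; the later call par(bg = ...) is the ordinary graphics function par() and does not supply those three arguments.

```r
# Stand-in hook registry: a named list of functions (hypothetical)
hooks <- list(
  par = function(before, options, envir) {
    if (before) "set margins before the chunk runs" else NULL
  }
)

# Hypothetical runner: calls every hook whose name appears among the chunk options
run_chunk <- function(options, envir = new.env()) {
  active <- intersect(names(hooks), names(options))
  log <- character(0)
  for (h in active)
    log <- c(log, hooks[[h]](before = TRUE, options = options, envir = envir))
  log <- c(log, "chunk code runs here")
  for (h in active)
    log <- c(log, hooks[[h]](before = FALSE, options = options, envir = envir))
  log
}

run_chunk(list(par = TRUE, echo = FALSE))  # hook fires, then the chunk runs
run_chunk(list(echo = FALSE))              # no 'par' option, so no hook
```

In this sketch the framework, not the user, supplies before, options and envir, which is why the hook has to be registered as a function.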











On 8 January 2013 18:03, Yihui Xie x...@yihui.name wrote:

 Everything you mentioned is possible; knitr has very comprehensive support
 for figures in LaTeX, and what you want in this case is subfigures
 (\usepackage{subfig}); here is an example:

 https://github.com/yihui/knitr-examples/blob/master/067-graphics-options.Rnw
 (search for 'fig.subcap' for the relevant chunk)

 And here is a preview: http://i.imgur.com/4lKpw.png

 Regards,
 Yihui
 --
 Yihui Xie xieyi...@gmail.com
 Phone: 515-294-2465 Web: http://yihui.name
 Department of Statistics, Iowa State University
 2215 Snedecor Hall, Ames, IA


 On Tue, Jan 8, 2013 at 4:17 AM, Francesco Sarracino
 f.sarrac...@gmail.com wrote:
  Dear R helpers,
 
  I am using knitr to run analysis with R and edit my document with LaTeX. I
  am wondering whether there is a way to include 2 or more pictures per chunk
  and to refer to them in the text independently, and also
  whether it is possible to give them different captions. Let me give you an
  example.Rnw:
 
  \documentclass{article}
  \title{Example}
  \author{FS}
  \begin{document}
  \maketitle
 
  I put some text here. I want to plot two charts in the same figure and
  label them independently.
  <<stat, echo = FALSE, results = 'hide'>>=
  ii <- 2000:2011
  xx <- rnorm(12,0,1)
  yy <- rnorm(12,0,1)
  pm <- data.frame(ii,xx,yy)
  @
 
  Now I generate the two pictures and put them into the same chunk with the
  option out.width set to .49 so that knitr places the two charts side by
  side:
  <<fig:example, echo = FALSE, out.width=".49\\linewidth", fig.cap="this is an example">>=
  plot(ii,xx, type = "l")
  plot(ii,yy, type = "l", lty = 2)
  @
 
  Finally, I want the reader to look at the figure on the left.
  \end{document}
 
  How can I do this? If I refer to \ref{fig:example} I will get the number
  of the figure, but not of the chart on the left.
  Also, is it possible to have separate captions for each chart?
  Thanks in advance for your kind help,
  f.
 
 
  --
  Francesco Sarracino, Ph.D.
  https://sites.google.com/site/fsarracino/
 




-- 
Francesco Sarracino, Ph.D.
https://sites.google.com/site/fsarracino/


[R] deparse substitute

2013-01-09 Thread Berry Boessenkool



Hi,

I'm writing a function that needs the input names (as character strings) as part
of the output.
With deparse(substitute( ) ) that works fine, until I replace all zeros with
0.001 (log is calculated at some time):

tf <- function(input) { input[input==0] <- 0.001 ;   deparse(substitute(input)) }
myguess <- 42
tf(myguess) # not "myguess", but "42"


Now when I extract the input names before replacing the zeros, this works:

tf <- function(input) { out <- deparse(substitute(input)) ;   input[input==0] <- 0.001 ;   out }
tf(myguess) # correct: "myguess"
myguess <- 0 ; tf(myguess) # ditto


While I did find a workaround, I'm still wondering why this happens.
Any hints on where to start reading?

Thanks ahead,
Berry

PS: R version 2.15.1 (2012-06-22) -- Roasted Marshmallows
Windows 7  - Platform: x86_64-pc-mingw32/x64 (64-bit)



  

Re: [R] t-test behavior given that the null hypothesis is true

2013-01-09 Thread Ted Harding

On 09-Jan-2013 08:50:46 Pavlos Pavlidis wrote:
 Dear all,
 I observer a strange behavior of the pvalues of the t-test under
 the null hypothesis. Specifically, I obtain 2 samples of 3
 individuals each from a normal distribution of mean 0 and variance 1.
 Then, I calculate the pvalue using the t-test (var.equal=TRUE,
 samples are independent). When I make a histogram of pvalues
 I see that consistently the bin of the smallest pvalues has a
 lower frequency. Is this a known behavior of the t-test or it's
 a kind of bug/random number generation problem?
 
 kind regards,
 idaios

Using the following code, I did not observe the behaviour you describe.
The histograms are consistent with a uniform distribution of the
P-values, and the lowest bin for the P-values (when the code is
run repeatedly) is not consistently lower (or higher, or anything
else) than the other bins.

## My code:
N <- 10000
Ps <- numeric(N)
for(i in (1:N)){
  X1 <- rnorm(3,0,1) ; X2 <- rnorm(3,0,1)
  Ps[i] <- t.test(X1,X2,var.equal=TRUE)$p.value
}
hist(Ps)


If you would post the code you used, the reason why you are observing
this may become more evident!

Hoping this helps,
Ted.
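
A compact variant of the same simulation, added as a sketch for readers who want a numerical check alongside the histogram. The seed, the bin width of 0.05 and the chi-squared comparison against equal bin counts are choices made here, not part of the original code.

```r
set.seed(2013)   # arbitrary seed so the run is repeatable
N <- 10000

# p-values of the equal-variance two-sample t-test under the null, n = 3 per group
Ps <- replicate(N, t.test(rnorm(3), rnorm(3), var.equal = TRUE)$p.value)

# Under the null, about 5% of p-values should fall below 0.05
mean(Ps < 0.05)

# Bin counts and a goodness-of-fit test against the uniform distribution
counts <- table(cut(Ps, breaks = seq(0, 1, by = 0.05), include.lowest = TRUE))
chisq.test(counts)$p.value
```

If the lowest bin were systematically under-filled, the first proportion would sit clearly below 0.05 across repeated runs with different seeds.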

-
E-Mail: (Ted Harding) ted.hard...@wlandres.net
Date: 09-Jan-2013  Time: 10:29:21
This message was sent by XFMail


[R] select partial name and full name columns

2013-01-09 Thread Irucka Embry

Hi, I have the following function:

getDataFromDVFileCustom <- function (file, hasHeader = TRUE, separator = "\t")
{
  DVdatatmp <- as.matrix(read.table(file, sep = "\t", fill = TRUE,
    comment.char = "#", as.is = TRUE, stringsAsFactors = FALSE,
    na.strings = "NA"))
  DVdatatmper <- as.matrix(DVdatatmp[ , c("datetime",
    grep("^_00060_3", colnames(DVdatatmp)))])
  retval <- as.data.frame(DVdatatmper, colClasses = c("character"),
    fill = TRUE, comment.char = "#", stringsAsFactors = FALSE)
  if (ncol(retval) == 2) {
    names(retval) <- c("dateTime", "value")
  }
  else if (ncol(retval) == 3) {
    names(retval) <- c("dateTime", "value", "code")
  }
  if (dateFormatCheck(retval$dateTime)) {
    retval$dateTime <- as.Date(retval$dateTime)
  }
  else {
    retval$dateTime <- as.Date(retval$dateTime, format = "%m/%d/%Y")
  }
  retval$value <- as.numeric(retval$value)
  return(retval)
}

The function gives me this error:
getDataFromDVFileCustom(file)
Error in as.matrix(DVdatatmp[, c("datetime", grep("^_00060_3",
colnames(DVdatatmp)))]) :
subscript out of bounds

I am trying to select only 3 columns: datetime and then two partial-name
columns that end in "00060_3" and "00060_3_cd". Each file that I
will be reading into the function has a different number of columns and
a different prefix in front of "00060_3" and "00060_3_cd". I have
searched online and tried those possible solutions, but they did not
work for my function and data.

What is the best way to select those 3 columns only?

Thank-you.

Irucka Embry 
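
One possible reading of the error, sketched on made-up data: c("datetime", grep(...)) mixes a column name with integer positions, so R coerces the positions to character strings such as "3", which are not column names, and the subscript is out of bounds. The "^" anchor also requires the name to start with "_00060_3", which a prefixed name cannot do. Asking grep() for the matching names with value = TRUE avoids both issues. The data frame dv and its column names are hypothetical.

```r
# Toy data: the prefix in front of 00060_3 differs from file to file
dv <- data.frame(
  agency_cd      = c("USGS", "USGS"),
  datetime       = c("2012-01-01", "2012-01-02"),
  X01_00060_3    = c("10.5", "11.2"),
  X01_00060_3_cd = c("A", "A"),
  X02_00065_3    = c("1.1", "1.2"),
  stringsAsFactors = FALSE
)

# Select by name: "datetime" plus every column ending in 00060_3 or 00060_3_cd
wanted <- c("datetime", grep("_00060_3(_cd)?$", names(dv), value = TRUE))
retval <- dv[, wanted]

names(retval) <- c("dateTime", "value", "code")
retval$dateTime <- as.Date(retval$dateTime)
retval$value <- as.numeric(retval$value)
retval
```

This assumes the file was read with its header row, so that the column names are available to match against.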



Re: [R] deparse substitute

2013-01-09 Thread Duncan Murdoch


On 13-01-09 5:03 AM, Berry Boessenkool wrote:



Hi,

I'm writing a function that needs the input names (as character strings) as part
of the output.
With deparse(substitute( ) ) that works fine, until I replace all zeros with
0.001 (log is calculated at some time):

tf <- function(input) { input[input==0] <- 0.001 ;   deparse(substitute(input)) }
myguess <- 42
tf(myguess) # not "myguess", but "42"


Now when I extract the input names before replacing the zeros, this works:

tf <- function(input) { out <- deparse(substitute(input)) ;   input[input==0] <- 0.001 ;   out }
tf(myguess) # correct: "myguess"
myguess <- 0 ; tf(myguess) # ditto


While I did find a workaround, I'm still wondering why this happens.
Any hints on where to start reading?


Probably the R Language definition, section 2.1.8.  The basic 
explanation for the behaviour you see is that deparse(substitute(input)) 
acts on the input promise object, looking at its expression slot. It's 
not doing any magic examination of the context in which it was 
originally defined.


Once you modify it, it is no longer a promise, and so it has no 
expression slot.


Duncan Murdoch
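
The explanation above can be checked with a small side-by-side sketch. The function names tf_after and tf_before are invented here to keep the two versions from the question apart.

```r
# Version 1: modify the argument first, then ask for its expression
tf_after <- function(input) {
  input[input == 0] <- 0.001    # 'input' is now an ordinary local variable
  deparse(substitute(input))    # so substitute() returns its value
}

# Version 2: capture the expression while 'input' is still a promise
tf_before <- function(input) {
  out <- deparse(substitute(input))
  input[input == 0] <- 0.001
  out
}

myguess <- 42
tf_after(myguess)    # "42"
tf_before(myguess)   # "myguess"
```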


[R] R-Forge package check error. Package dependencies on linux platform

2013-01-09 Thread Mengsteab Aregay

Dear All,
I've got an error in the R-Forge package check. When it checks on the Windows and
Mac platforms, it doesn't give an error, only one note regarding the
maintainer. However, the check fails on the Linux platform
with the following error. Because of this error I can't submit my
package to CRAN. Please help me solve the problem.

BcDiag log file (check_x86_64_linux)
Sun Dec 30 16:15:16 2012: Checking package BcDiag (SVN revision 7) ...
* using log directory /mnt/building/build_2012-12-30-16-05/RF_PKG_CHECK/PKGS/BcDiag.Rcheck
* using R version 2.15.2 Patched (2012-12-14 r61333)
* using platform: x86_64-unknown-linux-gnu (64-bit)
* using session charset: UTF-8
* checking for file BcDiag/DESCRIPTION ... OK
* this is package BcDiag version 1.0
* checking CRAN incoming feasibility ... NOTE
Maintainer: Aregay Mengsteab
New submission
* checking package namespace information ... OK
* checking package dependencies ... ERROR
Packages required but not available:  isa2 fabia
Packages suggested but not available for checking:  isa2 fabia

Regards,

Michael


Re: [R] error in a abline loop

2013-01-09 Thread PIKAL Petr

Hi

 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of arun
 Sent: Tuesday, January 08, 2013 3:30 PM
 To: Elaine Kuo
 Cc: R help
 Subject: Re: [R] error in a abline loop
 
 HI Elaine,
 
 In the data you sent to me, it had 5 levels for skin_color.
 data1<-read.csv("skin_color.csv",sep="\t")
 data1$skin_color<-factor(data1$skin_color)
 levels(data1$skin_color)
 #[1] "1" "2" "3" "4" "5"
 
 
 mypath<-file.path("/home/arun/Trial1",paste("Elaine_",1:5,".jpg",sep=""))
 #change the file.path according to your system

Instead of multiple files you can create a multipage PDF:

pdf("myfile.pdf")

for(i in 1:5){
# jpeg(file=mypath[i])
  plot(body_weight~body_length,data=data1[data1$skin_color==i,])
  line<-lm(body_weight~body_length,data=data1[data1$skin_color==i,])
  abline(line,col=c("yellow","blue","green","orange","red")[i],lwd=2)
}
dev.off()

see ?pdf for details

Regards
Petr
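
A self-contained version of the multipage idea, as a sketch on simulated data (the real skin_color.csv was only sent off-list). The data values and the output location in tempdir() are assumptions made for the illustration.

```r
# Simulated stand-in for data1
set.seed(1)
data1 <- data.frame(skin_color = rep(1:5, each = 10),
                    body_length = runif(50, 150, 190))
data1$body_weight <- 0.5 * data1$body_length + data1$skin_color + rnorm(50)

cols <- c("yellow", "blue", "green", "orange", "red")
out <- file.path(tempdir(), "myfile.pdf")

pdf(out)                       # each plot() call starts a new page
for (i in 1:5) {
  sub <- data1[data1$skin_color == i, ]
  plot(body_weight ~ body_length, data = sub, main = paste("skin_color", i))
  abline(lm(body_weight ~ body_length, data = sub), col = cols[i], lwd = 2)
}
dev.off()                      # close the device so the file is written

file.exists(out)
```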

#  }
 
 #or
 
 lapply(seq_along(mypath),function(i) {jpeg(file=mypath[i])
  line<-lm(body_weight~body_length,data=data1[data1$skin_color==i,])
  plot(body_weight~body_length,data=data1[data1$skin_color==i,])
  abline(line,col=c("yellow","blue","green","orange","red")[i],lwd=2)
  dev.off()
 })
 
 
 A.K.
 
 
 
 
 - Original Message -
 From: Elaine Kuo elaine.kuo...@gmail.com
 To: arun smartpink...@yahoo.com
 Cc:
 Sent: Monday, January 7, 2013 9:48 PM
 Subject: Re: [R] error in a abline loop
 
 Hello arun
 
 Thank you always.
 Please kindly help the attached data for your reference.
 
 Elaine
 
 
 On Tue, Jan 8, 2013 at 10:00 AM, arun smartpink...@yahoo.com wrote:
  HI,
 
  A possible guess ( with no data):
  for (i in 1:7) {
      subs <- data$skin_color==levels(data$skin_color)[i]
      line<-lm(body_weight~body_length, data=subset(data, subset=subs)) #closing parenthesis for lm( was missing
      abline(line,col=c("yellow","chocolate1","darkorange2",
  "red3","saddlebrown","coral4","grey38")[i],lwd=2)
      }
 
 
  A.K.
 
 
 
  - Original Message -
  From: Elaine Kuo elaine.kuo...@gmail.com
  To: r-help@r-project.org
  Cc:
  Sent: Monday, January 7, 2013 8:23 PM
  Subject: [R] error in a abline loop
 
  Hello
 
  I have data of body length and body weight of people of different
 skin colors.
 
  I tried to write a code to plot body length and body weight according
  to the skin colors.
  (Thanks for Petr's advice so far.)
 
  A loop is used but an error shows up in the following code.
  It says:
  unexpected '}' in
  "red3","red3","saddlebrown","coral4","chocolate4","darkblue","navy","grey38")[i],lwd=2)
      }
 
  Please kindly advise how to modify the code.
  Thank you.
 
  The code
    data <-read.csv("H:/skincolor.csv",header=T)
 
    # graph
      par(mai=c(1.03,1.03,0.4,0.4))
 
      plot(data$body_weight, data$body_length,
      xaxp=c(0,200,4),
      yaxp=c(0,200,4),
      type="p",
      pch=1,lwd=1.0,
      cex.lab=1.4, cex.axis=1.2,
      font.axis=2,
      cex=1.5,
      las=1,
      bty="l",col=c("yellow","chocolate1","darkorange2",
  "red3","saddlebrown","coral4","grey38")[as.numeric(data$skin_color)])
 
      ##
      for (i in 1:7) {
      subs <- data$skin_color==levels(data$skin_color)[i]
      line<-lm(body_weight~body_length, data=subset(data, subset=subs),
      abline(line,col=c("yellow","chocolate1","darkorange2",
  "red3","saddlebrown","coral4","grey38")[i],lwd=2)
      }
 

[R] update.packages problem

2013-01-09 Thread Terry Therneau

I've updated to R-devel on my development machine and have lots of packages.  The 
update.packages() script ends up with 33 failures, all due to out-of-order reloading.  
That is, if package abc depends on package xyz, then the reinstall of abc fails with a 
message that version of xyz is built before R 3.0.0: please re-install it.


So I ran it a second time, and got 32 failures.  There should be a better way.


Re: [R] update.packages problem

2013-01-09 Thread Prof Brian Ripley


On 09/01/2013 12:52, Terry Therneau wrote:

I've updated to R-devel on my development machine and have lots of
packages.  The update.packages() script ends up with 33 failures, all
due to out-of-order reloading. That is, if package abc depends on
package xyz, then the reinstall of abc fails with a message that
version of xyz is built before R 3.0.0: please re-install it.

So I ran it a second time, and got 32 failures.  There should be a
better way.


There is, on the help page!

checkBuilt: If ‘TRUE’, a package built under an earlier minor version
  of R is considered to be ‘old’.

As the NEWS file says, right at the top

  Packages need to have been installed under this version of R.
  (Pro tem, this is considered to be R-devel from April 2012.)

so my guess is that 'xyz' was not even installed under R-devel but 2.15.x.

We don't usually discuss development versions of R on R-help.
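
A short sketch of how the checkBuilt argument mentioned above might be used. The actual update call is left commented out because it needs a configured CRAN mirror and can reinstall many packages; the two lines that do run only inspect the local installation.

```r
# Confirm that update.packages() has the checkBuilt argument in this R version
"checkBuilt" %in% names(formals(utils::update.packages))

# The "Built" column shows the R version each installed package was built under
ip <- installed.packages()
head(ip[, c("Package", "Built")])

# Treat packages built under an earlier minor version of R as old (not run here):
# update.packages(checkBuilt = TRUE, ask = FALSE)
```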






--
Brian D. Ripley,                  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


Re: [R] problems regarding the latest version of the package BRugs

2013-01-09 Thread John Kane

I think we need a much clearer statement of the problem.
This link provides some suggestions on how to frame the problem:
https://github.com/hadley/devtools/wiki/Reproducibility





John Kane
Kingston ON Canada


 -Original Message-
 From: mcmoumi...@gmail.com
 Sent: Wed, 9 Jan 2013 08:55:25 +0530
 To: r-help@r-project.org
 Subject: [R] problems regarding the latest version of the package BRugs
 
 Respected Sir/Madam,
 I am a research scholar of the Department of Statistics, University of
 Calcutta. I had downloaded the latest version of BRugs, and installed it
 in R 2.15.1, both 32 and 64 bits, with the help of OpenBUGS 3.2.2. My
 problem is that one of the programmes which

What programme?  An R package, or one you have written yourself?

Please supply some sample data and code.
The easiest way to supply data is to use the dput() function.  Example with
your file named testfile:
dput(testfile)
Then copy the output and paste it into your email.  For large data sets, you can
just supply a representative sample.  Usually,
dput(head(testfile, 100)) will be sufficient.
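A small sketch of the dput() round trip, in case it helps. The contents of testfile here are made up; only the object name comes from the text above.

```r
# A small stand-in for the poster's data (contents are made up).
testfile <- data.frame(
  id    = 1:6,
  group = c("a", "b", "a", "b", "a", "b"),
  stringsAsFactors = FALSE
)

# dput() writes an R expression that rebuilds the object; head() keeps
# the posted sample short.
dput(head(testfile, 3))

# Anyone on the list can rebuild the sample by evaluating the pasted text.
pasted  <- paste(capture.output(dput(head(testfile, 3))), collapse = "\n")
rebuilt <- eval(parse(text = pasted))
```

The point of posting dput() output rather than a printed table is that column types survive the trip through email.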
 

 requires the package BRugs is giving me an error given below:
   Error in samplesSize(node) :
   node must be a scalar variable from the model
 
 However, other persons who are running the programme using the earlier
 version of this package can run it without any error.  I am unable to find
 where the actual problem lies. I would be very much obliged if you would
 kindly give me a solution to the problem.
 Thanking you.
   Moumita Chatterjee.
 

[R] Need an advise for bayesian estimate

2013-01-09 Thread kyong park

Hi R bayesians,

I need advice on how to reconcile the differing estimates from a
traditional glm (TG) and a Bayesian glm (BG), and the different results that
bayesglm in R gives depending on the format of the response data and the
prior specification. I'm not familiar with Bayesian estimation; my colleague
asked me to look into this because the EPA from France reported quite
different estimates for the following ethylene data, obtained with a Bayesian
method in MCSIM. As seen below, glm gives the same results regardless of the
response data format (two-column or binary), but bayesglm gives different
results. The result from the French report is -91.78+8*lnC+6.055*lnT, which
lies between the bayesglm results for prior.df=1 and 2 with the binary data
format in R.

My questions are as follows:

1. What is the advantage of a Bayesian estimate? Is it better for small samples?
2. How can the different estimates, which depend on the format of the
response data and the prior specification (e.g. prior.df), be reconciled?
3. Should we use an interval estimate rather than a point estimate for BG?

Two-Column format:

  cppm Tmin     lnC     lnT Death Number
1 1850  240 7.52294 5.48064     5      5
2 1637  240 7.40062 5.48064     4      5
3 1443  240 7.27448 5.48064     1      5
4 1021  240 6.92854 5.48064     0      5
5 4827   60 8.48198 4.09434     5      5
6 4202   60 8.34332 4.09434     1      5
7 4064   60 8.30992 4.09434     5      5
8 3966   60 8.28551 4.09434     2      5
9 3609   60 8.19119 4.09434     0      5

Binary data format:

   cppm Tmin     lnC     lnT resp
1  1850  240 7.52294 5.48064    1
2  1850  240 7.52294 5.48064    1
3  1850  240 7.52294 5.48064    1
4  1850  240 7.52294 5.48064    1
5  1850  240 7.52294 5.48064    1
6  1637  240 7.40062 5.48064    1
7  1637  240 7.40062 5.48064    1
8  1637  240 7.40062 5.48064    1
9  1637  240 7.40062 5.48064    1
10 1637  240 7.40062 5.48064    0
11 1443  240 7.27448 5.48064    1
12 1443  240 7.27448 5.48064    0
13 1443  240 7.27448 5.48064    0
14 1443  240 7.27448 5.48064    0
attach(ethylene)
DL <- cbind(Death, Alive = Number - Death)
Call:
glm(formula = DL ~ lnC + lnT, family = binomial(link = probit),
    data = ethylene)
Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) -145.156     43.668  -3.324 0.000887 ***
lnC           12.972      3.918   3.311 0.000931 ***
lnT            9.122      2.736   3.335 0.000854 ***

Using binary data:
Call:
glm(formula = resp ~ lnC + lnT, family = binomial(link = probit),
    data = ethylene.mod)
Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) -145.157     43.670  -3.324 0.000887 ***
lnC           12.972      3.918   3.311 0.000931 ***
lnT            9.122      2.736   3.334 0.000855 ***

Using bayesglm with two-column data:

> summary(result3)
Call:
bayesglm(formula = DL ~ lnC + lnT, family = binomial(link = probit),
    data = ethylene)
Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) -134.971     17.490  -7.717 1.19e-14 ***
lnC           12.060      1.570   7.680 1.59e-14 ***
lnT            8.485      1.095   7.751 9.11e-15 ***

Using bayesglm with binary data:
Call:
bayesglm(formula = resp ~ lnC + lnT, family = binomial(link = probit),
    data = ethylene.mod)
Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  -98.477     26.919  -3.658 0.000254 ***
lnC            8.792      2.423   3.628 0.000286 ***
lnT            6.208      1.681   3.694 0.000221 ***
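For anyone who wants to reproduce the glm part, here is a rough sketch. It assumes the two-column table above is the complete data set and that the binary data are simply one row per animal; lnC and lnT are recomputed with log() rather than typed in, and the names fit_grouped and fit_binary are made up. It only covers plain glm(), not bayesglm.

```r
# Two-column (grouped) data, as tabulated in the post.
ethylene <- data.frame(
  cppm   = c(1850, 1637, 1443, 1021, 4827, 4202, 4064, 3966, 3609),
  Tmin   = c(240, 240, 240, 240, 60, 60, 60, 60, 60),
  Death  = c(5, 4, 1, 0, 5, 1, 5, 2, 0),
  Number = c(5, 5, 5, 5, 5, 5, 5, 5, 5)
)
ethylene$lnC <- log(ethylene$cppm)
ethylene$lnT <- log(ethylene$Tmin)

# Expand to one row per animal: Death ones followed by Number - Death zeros.
idx <- rep(seq_len(nrow(ethylene)), times = ethylene$Number)
ethylene.mod <- ethylene[idx, c("cppm", "Tmin", "lnC", "lnT")]
ethylene.mod$resp <- unlist(Map(function(d, n) rep(c(1, 0), c(d, n - d)),
                                ethylene$Death, ethylene$Number))

# Same probit model, fitted to each format.
fit_grouped <- glm(cbind(Death, Number - Death) ~ lnC + lnT,
                   family = binomial(link = "probit"), data = ethylene)
fit_binary  <- glm(resp ~ lnC + lnT,
                   family = binomial(link = "probit"), data = ethylene.mod)

# Side-by-side coefficients for the two formats.
rbind(grouped = coef(fit_grouped), binary = coef(fit_binary))
```

Both formats carry the same binomial likelihood, which is why the two glm fits are expected to agree up to convergence tolerance.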


Re: [R] how to count A, C, T, G in each row in a big data.frame?

2013-01-09 Thread jim holtman

forgot the data.  this will count the characters; you can add logic
with 'table' to count groups


x <-
structure(list(name = c(Gga_rs10722041, Gga_rs10722249, Gga_rs10722565,
Gga_rs10723082, Gga_rs10723993, Gga_rs10724555, Gga_rs10726238,
Gga_rs10726461, Gga_rs10726774, Gga_rs10726967, Gga_rs10727581,
Gga_rs10728004, Gga_rs10728156, Gga_rs10728177, Gga_rs10728373,
Gga_rs10728585, Gga_rs10729598, Gga_rs10729643, Gga_rs10729685,
Gga_rs10729827), chr = c(7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L,
7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L), pos = c(11248993L,
20038370L, 16164457L, 38050527L, 20307106L, 13707090L, 12230458L,
36732967L, 2790856L, 1305785L, 29631963L, 13606593L, 13656397L,
2261611L, 32096703L, 13733153L, 16524147L, 558735L, 12514023L,
3619538L), strand = c(+, +, +, +, +, +, +, +,
+, +, +, +, +, +, +, +, +, +, +, +),
X2353 = c(AA, TT, TT, CC, TT, CC, CC, TT,
CC, GG, AG, AG, AG, TT, CC, AG, CC, AA,
GG, GG), X2409 = c(AA, CT, TT, CC, CT, CC,
CC, TT, CC, GG, GG, AG, AG, TT, CC, AG,
CC, AA, AG, GA), X2500 = c(GA, TT, TT, CC,
TT, CC, CC, TT, CC, GG, GG, GG, GG, GT,
CT, GG, CC, AA, AA, AA), X2598 = c(AA, TT,
TT, CC, TT, CC, CC, TT, CC, GG, AA, AG,
GG, TT, CC, AG, TC, AA, AA, AG), X2610 = c(AA,
TT, TT, CC, TT, CC, CC, TT, CC, GG, GA,
GA, GG, TT, CC, GA, CC, AA, AA, GA), X2300 = c(GA,
TT, TT, CC, TT, CC, CC, TT, CC, GG, GA,
AA, AG, TT, TC, AA, TC, AA, AG, AA), X2507 = c(AG,
TT, TT, CC, TT, CC, CC, TT, CC, GG, GG,
GA, GG, TT, TC, GG, CC, AA, GA, AG), X2530 = c(AG,
TC, TT, CC, TC, CC, CC, TT, CC, GG, AA,
GG, GG, TT, CC, GG, CC, AA, AA, AA), X2327 = c(AA,
TT, TT, CC, TT, CC, CC, TT, CC, GG, GA,
GG, GG, TT, TC, GG, CC, AA, AA, AA), X2389 = c(AA,
CC, TT, CC, CC, CC, CC, TT, CC, AG, GG,
AG, GG, TT, TC, AG, CC, AA, AA, AA), X2408 = c(AA,
TT, TT, CC, TT, CC, CC, TT, CC, GG, GA,
GA, GG, TT, CC, GA, CC, AA, AA, AG), X2463 = c(AA,
TT, TT, CC, TT, CC, CC, TT, CC, GG, GG,
GG, GG, TT, CT, GG, CC, AA, AA, GA), X2420 = c(GA,
TC, TT, CC, TC, CC, CC, TT, CC, GG, AG,
GG, GG, TG, TT, GG, CT, AA, AA, AA), X2563 = c(GA,
CC, TT, CC, TC, CC, CC, TT, CC, GG, GA,
GG, GG, GT, TT, GG, CT, AA, AA, AA), X2462 = c(AA,
TT, TT, CC, TT, CC, CC, TT, CC, GG, AA,
GG, GG, GT, TC, GG, CC, AA, AA, AA), X2292 = c(GA,
TT, TT, CC, TT, CC, CC, TT, CC, GG, GA,
AA, GG, TG, TC, AA, TC, AA, AA, AA), X2405 = c(GA,
TT, TT, CC, TT, CC, CC, TT, CC, GG, GG,
AG, GG, TG, TT, AA, CT, AA, AA, AA), X2543 = c(AA,
TC, TT, CC, TC, CC, CC, TT, CC, GA, GA,
GA, GG, TT, CT, GA, TT, AA, AA, GG), X2557 = c(AG,
CT, TT, CC, CT, CC, CC, TT, CC, GG, AG,
GA, GG, GT, CT, GA, CT, AA, AA, AG), X2583 = c(GA,
CT, TT, CC, CT, CC, CC, TT, CC, GG, GA,
GG, GG, GG, CT, GA, CT, AA, AA, AG), X2322 = c(AG,
TT, TT, CC, TT, CC, CC, TT, CC, GG, GG,
GG, GG, GT, TT, GG, CC, AA, AA, GA), X2535 = c(AA,
TC, TT, CC, TT, CC, CC, TT, CC, GG, GA,
GG, GG, TT, CC, GG, CC, AA, AA, AG), X2536 = c(GA,
TC, TT, CC, TC, CC, CC, TT, CC, GG, GG,
AG, GG, TT, TC, AG, TC, AA, AA, GA), X2581 = c(AG,
CT, TT, CC, CT, CC, CC, TT, CC, GG, GG,
GA, GG, TT, CC, GA, CT, AA, AA, AG), X2570 = c(AA,
CT, TT, CC, CT, CC, CC, TT, CC, GG, GG,
GG, GG, TT, TC, GG, CC, AA, AA, GG), X2476 = c(AA,
TT, TT, CC, TT, CC, CC, TT, CC, GG, GG,
GG, GG, GT, TC, AG, CC, AA, AA, AG), X2534 = c(GA,
TC, TT, CC, TC, CC, CC, TT, CC, GG, GA,
AG, GG, TG, CC, AG, TC, AA, AA, AA), X2280 = c(AA,
TC, TT, CC, TC, CC, CC, TT, CC, GG, AG,
AG, GG, TT, CC, GG, CC, AA, AA, AG), X2316 = c(AA,
CC, TT, CC, CC, CC, CC, TT, CC, AG, AA,
AA, AG, TT, TC, GG, CT, AA, GG, GG), X2339 = c(AA,
CC, TT, CC, CC, CC, CC, TT, CC, GA, AA,
GG, GG, GT, CT, GG, TT, AA, AA, AG), X2331 = c(AA,
TC, TT, CC, TC, CC, CC, TT, CC, GG, GG,
GG, GG, TT, CC, GG, CC, AA, AA, AG), X2343 = c(AA,
TC, TT, CC, TC, CC, CC, TT, CC, GG, GG,
GG, GG, TT, CT, GG, CC, AA, AA, GA), X2352 = c(AA,
TT, TT, CC, TT, CC, CC, TT, CC, GG, AA,
GG, GG, TT, CC, GG, CC, AA, GA, AG), X2293 = c(GA,
TT, TT, CC, TT, CC, CC, TT, CC, GG, GA,
AA, GG, TT, TC, AA, CT, AA, AA, AA), X2338 = c(GA,
TT, TT, CC, TT, CC, CC, TT, CC, GG, GG,
AG, GG, TT, TC, AG, TC, AA, AA, GA), X2449 = c(AA,
TT, TT, CC, TT, CC, CC, TT, CC, GG, AG,
AA, GG, TT, CC, AA, TC, AA, AA, GA), X2296 = c(GA,
TT, TT, CC, TT, CC, CC, TT, CC, GA, GG,
AG, GG, TG, TC, AG, CC, AA, AA, AA), X2453 = c(AG,
TT, TT, CC, TT, CC, CC, TT, CC, AG, GG,
GA, GG, GT, CT, GA, CT, AA, AA, GA), X2460 = c(AA,
TT, TT, CC, TT, CC, CC, TT, CC, AG, GG,
GG, GG, TG, CT, GG, CC, AA, AA, AA), X2474 = c(AA,
TC, TT, CC, TC, CC, CC, TT, CC, GA, AG,
AG, GG, TT, CC, AG, TC, AA, AA, GA), X2603 = c(AA,
TT, TT, CC, TT, CC, CC, TT, CC, GG, AG,
AG, GG, TT, CC, AG, CC, AA, AA, GA), X2282 = c(GA,
TC, TT, CC, TC, CC, CC, TT, CC, GG, GG,
AA, GG, TT, TT,

Re: [R] t-test behavior given that the null hypothesis is true

2013-01-09 Thread Ted Harding

Ah! You have assigned a parameter equal.var=TRUE, and equal.var
is not a listed parameter for t.test() -- see ?t.test :

  t.test(x, y = NULL,
         alternative = c("two.sided", "less", "greater"),
         mu = 0, paired = FALSE, var.equal = FALSE,
         conf.level = 0.95, ...)

Try it instead with var.equal=TRUE, i.e. in your code:
  for(i in 1:k){
    rv.t.pvalues[i] <- t.test(rv[i, 1:(c/2)], rv[i, (c/2+1):c],
      ##equal.var=TRUE, alternative="two.sided")$p.value
      var.equal=TRUE, alternative="two.sided")$p.value
  }

When I run your code with equal.var, I indeed repeatedly see
the deficient bin for the lowest P-values that you observed.
When I run your code with var.equal I do not see it.

The explanation is that, since equal.var is not a recognised
parameter for t.test(), it has assumed the default value FALSE
for var.equal, and has therefore (since it is a 2-sample test)
adopted the Welch/Satterthwaite procedure:

  var.equal: a logical variable indicating whether to treat
the two variances as being equal. If 'TRUE' then the
pooled variance is used to estimate the variance
otherwise the Welch (or Satterthwaite) approximation
to the degrees of freedom is used.
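
The silent fallback is easy to check directly. In the sketch below (the seed, sample values and object names are made up for illustration), the misspelled equal.var is absorbed by the ... argument of t.test(), so the call returns the same p-value as the default Welch test rather than the pooled-variance one:

```r
set.seed(42)
x1 <- rnorm(3)
x2 <- rnorm(3)

p_typo    <- t.test(x1, x2, equal.var = TRUE)$p.value  # misspelled: ignored
p_default <- t.test(x1, x2)$p.value                    # Welch (var.equal = FALSE)
p_pooled  <- t.test(x1, x2, var.equal = TRUE)$p.value  # pooled variance

# The typo changes nothing relative to the default call.
isTRUE(all.equal(p_typo, p_default))
```

With equal group sizes the two procedures share the same t statistic and differ only in the degrees of freedom, so the pooled p-value cannot exceed the Welch one.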

This has the effect of somewhat adapting the test procedure to
the data, so that extreme (i.e. small) values of P are even
rarer than they should be.
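
A quick simulation along the lines of the code in this thread shows the effect on the lowest bin. This is only a sketch: the seed, the number of replicates and the 0.05 bin edge are my own choices, not values from the original code.

```r
set.seed(2013)
n_sim <- 5000
p_pooled <- numeric(n_sim)
p_welch  <- numeric(n_sim)

# Two samples of 3 from N(0, 1), tested both ways on the same data.
for (i in seq_len(n_sim)) {
  x1 <- rnorm(3)
  x2 <- rnorm(3)
  p_pooled[i] <- t.test(x1, x2, var.equal = TRUE)$p.value
  p_welch[i]  <- t.test(x1, x2, var.equal = FALSE)$p.value
}

# Share of p-values falling in the lowest bin [0, 0.05).
low_bin <- c(pooled = mean(p_pooled < 0.05), welch = mean(p_welch < 0.05))
low_bin
```

Plotting hist(p_pooled) and hist(p_welch) next to each other gives the same picture as the histograms discussed above.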

With best wishes,
Ted.

On 09-Jan-2013 13:24:59 Pavlos Pavlidis wrote:
 Hi Ted,
 thanks for the reply. I use a similar code which you can see below:
 
 k <- 1
 c <- 6
 rv <- array(NA, dim=c(k, c) )
 for(i in 1:k){
   rv[i,] <- rnorm(c, mean=0, sd=1)
 }
 
 rv.t.pvalues <- array(NA, k)
 
 for(i in 1:k){
   rv.t.pvalues[i] <- t.test(rv[i, 1:(c/2)], rv[i, (c/2+1):c],
     equal.var=TRUE, alternative="two.sided")$p.value
 }
 
 hist(rv.t.pvalues)
 
 The histogram is this one:
 http://tinyurl.com/histogram-rt-pvalues-pdf
 
 all the best
 idaios
 
 
 On Wed, Jan 9, 2013 at 12:29 PM, Ted Harding ted.hard...@wlandres.net wrote:
 
 On 09-Jan-2013 08:50:46 Pavlos Pavlidis wrote:
  Dear all,
  I observe a strange behavior of the pvalues of the t-test under
  the null hypothesis. Specifically, I obtain 2 samples of 3
  individuals each from a normal distribution of mean 0 and variance 1.
  Then, I calculate the pvalue using the t-test (var.equal=TRUE,
  samples are independent). When I make a histogram of pvalues
  I see that consistently the bin of the smallest pvalues has a
  lower frequency. Is this a known behavior of the t-test or it's
  a kind of bug/random number generation problem?
 
  kind regards,
  idaios

 Using the following code, I did not observe the behaviour you describe.
 The histograms are consistent with a uniform distribution of the
 P-values, and the lowest bin for the P-values (when the code is
 run repeatedly) is not consistently lower (or higher, or anything
 else) than the other bins.

 ## My code:
 N <- 1
 Ps <- numeric(N)
 for(i in (1:N)){
   X1 <- rnorm(3,0,1) ; X2 <- rnorm(3,0,1)
   Ps[i] <- t.test(X1,X2,var.equal=TRUE)$p.value
 }
 hist(Ps)
 

 If you would post the code you used, the reason why you are observing
 this may become more evident!

 Hoping this helps,
 Ted.

 -
 E-Mail: (Ted Harding) ted.hard...@wlandres.net
 Date: 09-Jan-2013  Time: 10:29:21
 This message was sent by XFMail
 -

 

-
E-Mail: (Ted Harding) ted.hard...@wlandres.net
Date: 09-Jan-2013  Time: 14:51:04
This message was sent by XFMail


Re: [R] how to count A, C, T, G in each row in a big data.frame?

2013-01-09 Thread Jessica Streicher

If test is the structure, will

test2 <- sapply(test[,-c(1:4)],function(x){table(t(x))})

do what you want?
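
If the goal is a count of A, C, G and T per row (per SNP) rather than per column, something like the following might be closer. It is only a sketch on made-up toy data in the same layout (four annotation columns, then one two-letter genotype column per sample); geno, count_bases and the sample names are all hypothetical.

```r
# Toy data in the same layout as the thread (values are made up).
geno <- data.frame(
  name   = c("snp1", "snp2", "snp3"),
  chr    = c(7L, 7L, 7L),
  pos    = c(100L, 200L, 300L),
  strand = c("+", "+", "+"),
  s1     = c("AA", "CT", "GG"),
  s2     = c("AG", "TT", "GA"),
  s3     = c("AA", "CC", "GG"),
  stringsAsFactors = FALSE
)

bases <- c("A", "C", "G", "T")

# For each row: paste the genotype strings together, split them into
# single letters, and tabulate against fixed levels so that every row
# yields four counts (zeros included).
count_bases <- function(df, skip = 4) {
  calls <- as.matrix(df[, -seq_len(skip), drop = FALSE])
  counts <- t(vapply(seq_len(nrow(calls)), function(i) {
    single <- strsplit(paste(calls[i, ], collapse = ""), "")[[1]]
    as.integer(table(factor(single, levels = bases)))
  }, integer(4)))
  dimnames(counts) <- list(df$name, bases)
  counts
}

counts <- count_bases(geno)
counts
```

Using a factor with fixed levels keeps the result rectangular even when a base never occurs in a row.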

On 09.01.2013, at 15:48, jim holtman wrote:

 forgot the data.  this will count the characters; you can add logic
 with 'table' to count groups
 
 

Re: [R] Need an advise for bayesian estimate

2013-01-09 Thread Jose Iparraguirre

Hi Kyong,

Even if it is not (as can be inferred from what you said) a homework or 
assignment related query (and the group has a clear policy against such 
requests), the questions you posed nonetheless have very little to do 
specifically with R. Instead, they are about statistics. In this respect, I 
would suggest that you read "Data Analysis Using Regression and 
Multilevel/Hierarchical Models" by Andrew Gelman and Jennifer Hill, which 
covers these issues.
Regards,

José




[R] Reminder: useR meetup group in Munich, Germany

2013-01-09 Thread Markus Schmidberger

Dear all,

this is a short reminder for the Meetup Munich useR group. Next week 
Wednesday (16th January 2013)  we have our first meeting with two talks about 
Reporting and Reproducible Research with R and some ideas for an R 
certification.

More information at http://www.meetup.com/munich-useR-group/events/63749502/

Meet you next week
Markus


On Dec 21, 2012, at 3:28 PM, Markus Schmidberger mschmidber...@freenet.de 
wrote:

 Dear all,
 
 I would like to invite Munich (Germany) area R users for our first meeting: 
 16th January 2013. The group is aimed to bring together practitioners (from 
 industry and academia) in order to exchange knowledge and experience in 
 solving data analysis and statistical problems by using R. More information 
 about the group at: 
 http://www.meetup.com/munich-useR-group/
 
 1. Meeting: http://www.meetup.com/munich-useR-group/events/63749502/
 
 Merry Christmas, happy New Year and see you in 2013
 Markus
 
 
 
 


[R] Basic loop programming

2013-01-09 Thread Paolo Donatelli

Hi all,

newbie question: I am trying to set up a very simple loop without succeeding.

Let's say I have monthly observation of two variables for a year
- Sales_2012_01, Sales_2012_02, Sales_2012_03,   (total sales
for jan 2012,feb 2012, etc.)
- Customers_2012_01, Customers_2012_02,   (total number of
customers for jan 2012, etc.)

and I want to create new monthly variables in order to compute
revenues per customers:

Av_revenue_2012_01 = Sales_2012_01 / Customers_2012_01
Av_revenue_2012_02 = Sales_2012_02 / Customers_2012_02
...

how can I proceed?


In other programming languages I would just write something like
for (i in list(01,02, ..., 12) {
Av_revenue_2012_'i' = Sales_2012_'i' / Customers_2012_'i'
}

but in R it seems not to work like that. Further, and correct me if I
am wrong, I cannot use simple (i in 1:12) since I have a 0 digit in
front of the single-digit months.

thanks in advance for your help


Re: [R] Basic loop programming

2013-01-09 Thread PIKAL Petr

Hi

 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of Paolo Donatelli
 Sent: Wednesday, January 09, 2013 5:03 PM
 To: r-help@r-project.org
 Subject: [R] Basic loop programming

 Hi all,

 newbie question: I am trying to set up a very simple loop without
 succeeding.

 Let's say I have monthly observation of two variables for a year
 - Sales_2012_01, Sales_2012_02, Sales_2012_03,   (total sales
 for jan 2012,feb 2012, etc.)
 - Customers_2012_01, Customers_2012_02,   (total number of
 customers for jan 2012, etc.)

 and I want to create new monthly variables in order to compute revenues
 per customers:

 Av_revenue_2012_01 = Sales_2012_01 / Customers_2012_01
 Av_revenue_2012_02 = Sales_2012_02 / Customers_2012_02 ...

 how can I proceed?

 In other programming language I used just to write something like for
 (i in list(01,02, ..., 12) { Av_revenue_2012_'i' = Sales_2012_'i'
 / Customers_2012_'i'
 }

 but in R it seems not to work like that. Further, and correct me if I
 am wrong, I cannot use simple (i in 1:12) since I have a 0 digit in
 front of the single-digit months.

Hm. Why do you want to do it in R if you prefer other languages? Did you find R 
by accident or are you prepared to use it in future? If you want to use it, now 
is the right time to learn some basics.

Anyway, if you have 2 vectors, call them sales and customers, you can just do

av.revenue <- sales/customers
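
If the monthly values really do live in separate objects named as in the question, one way to get them into two vectors first is sketched below. The numbers are made up and only three months are shown; sprintf("%02d", ...) takes care of the leading zero.

```r
# Hypothetical monthly totals, named as in the question.
Sales_2012_01 <- 100; Sales_2012_02 <- 120; Sales_2012_03 <- 90
Customers_2012_01 <- 25; Customers_2012_02 <- 30; Customers_2012_03 <- 18

months <- sprintf("%02d", 1:3)   # "01" "02" "03": zero-padded suffixes

# Collect the separately named objects into two vectors.
sales     <- unlist(mget(paste0("Sales_2012_", months), envir = environment()))
customers <- unlist(mget(paste0("Customers_2012_", months), envir = environment()))

# One vectorised division replaces the loop.
av.revenue <- unname(sales / customers)
names(av.revenue) <- paste0("Av_revenue_2012_", months)
av.revenue
```

Keeping the result as one named vector is usually easier to work with than twelve separate Av_revenue_2012_xx objects.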

Unless you provide more info about your data, e.g. via at least one of

?head, ?str or preferably ?dput

you will hardly get any suitable advice.

Regards
Petr

 thanks in advance for your help



Re: [R] Basic loop programming

2013-01-09 Thread Jose Iparraguirre

Hi Paolo,

You say you have monthly observations of two variables, say Sales and 
Customers. 

Then, what you should have is something like this:

Year  Month  Sales    Customer
2012  Jan    ss_12.1  cc_12.1
2012  Feb    ss_12.2  cc_12.2
...   ...    ...      ...
2013  Jan    ss_13.1  cc_13.1
2013  Feb    ss_13.2  cc_13.2
...   ...    ...      ...

where ss_YY.M and cc_YY.M are numerical values (the total sales and number of 
customers for year YY and month M, respectively). For example,

Year  Month  Sales  Customer
2012  Jan    100    25
2012  Feb    120    30
...   ...    ...    ...

If this is the case, and you have the data in a data frame (say df), all you 
need to do to create a new column in your data frame with the average revenue 
is:

 df$Av_revenue <- df$Sales / df$Customer
 
You can omit the leading df$ from the instruction above if you want to create 
the object Av_revenue but not include it in the data frame.
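
As a small worked sketch, using the two example rows above plus one made-up third row (the name df and the March figures are assumptions for illustration):

```r
# Long-format data: one row per month.
df <- data.frame(
  Year     = c(2012, 2012, 2012),
  Month    = c("Jan", "Feb", "Mar"),
  Sales    = c(100, 120, 90),
  Customer = c(25, 30, 18),
  stringsAsFactors = FALSE
)

# The division is vectorised, so no loop over months is needed.
df$Av_revenue <- df$Sales / df$Customer
df
```

With the data in this shape, adding further months or years only means adding rows.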

If I am not getting it right, would you please send us the first three or four 
lines of your data?

Regards,

José



Re: [R] rJava Error

2013-01-09 Thread Simon Urbanek

On Jun 27, 2012, at 12:16 AM, fabin.ittiachan wrote:

 Hi,
 
 I'm receiving an error when I am trying to install rJava. I have posted the
 error below.
 

Your R was not compiled with --enable-R-shlib so you can't use JRI (see 
http://rforge.net/rJava). You can either disable JRI (if you don't need it) or 
have to use R compiled with shlib support.
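
A quick way to check whether a given R build has the shared library is to look for libR under the R home directory. This is only a sketch: the lib/libR.so location is an assumption that holds for typical Linux source builds (macOS uses libR.dylib), and has_r_shlib is a made-up helper name.

```r
# Hypothetical helper: TRUE if a shared libR is found under <R home>/lib.
# Assumes a Unix-like layout; file names differ on other platforms.
has_r_shlib <- function(r_home = R.home()) {
  candidates <- file.path(r_home, "lib", c("libR.so", "libR.dylib"))
  any(file.exists(candidates))
}

has_r_shlib()
```

If this returns FALSE on Linux, R would need to be rebuilt from source with ./configure --enable-R-shlib before JRI can be compiled.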

NB: stats-rosuda-devel mailing list is the proper list for rJava questions.

Cheers,
Simon


 RHive_0.0-6.tar.gz  rJava_0.9-3.tar.gz  RJDBC_0.2-0.tar.gz 
 Rserve_0.6-8.tar.gz
 [root@localhost Package]# R CMD INSTALL rJava_0.9-3.tar.gz 
 * installing to library ‘/usr/local/lib64/R/library’
 * installing *source* package ‘rJava’ ...
 ** package ‘rJava’ successfully unpacked and MD5 sums checked
 checking for gcc... gcc -std=gnu99
 checking for C compiler default output file name... a.out
 checking whether the C compiler works... yes
 checking whether we are cross compiling... no
 checking for suffix of executables... 
 checking for suffix of object files... o
 checking whether we are using the GNU C compiler... yes
 checking whether gcc -std=gnu99 accepts -g... yes
 checking for gcc -std=gnu99 option to accept ISO C89... none needed
 checking how to run the C preprocessor... gcc -std=gnu99 -E
 checking for grep that handles long lines and -e... /bin/grep
 checking for egrep... /bin/grep -E
 checking for ANSI C header files... yes
 checking for sys/wait.h that is POSIX.1 compatible... yes
 checking for sys/types.h... yes
 checking for sys/stat.h... yes
 checking for stdlib.h... yes
 checking for string.h... yes
 checking for memory.h... yes
 checking for strings.h... yes
 checking for inttypes.h... yes
 checking for stdint.h... yes
 checking for unistd.h... yes
 checking for string.h... (cached) yes
 checking sys/time.h usability... yes
 checking sys/time.h presence... yes
 checking for sys/time.h... yes
 checking for unistd.h... (cached) yes
 checking for an ANSI C-conforming const... yes
 checking whether time.h and sys/time.h may both be included... yes
 configure: checking whether gcc -std=gnu99 supports static inline...
 yes
 checking whether setjmp.h is POSIX.1 compatible... yes
 checking whether sigsetjmp is declared... yes
 checking whether siglongjmp is declared... yes
 checking Java support in R... present:
 interpreter : '/usr/local/Java/jre/bin/java'
 archiver: '/usr/local/Java/jre/../bin/jar'
 compiler: '/usr/local/Java/jre/../bin/javac'
 header prep.: '/usr/local/Java/jre/../bin/javah'
 cpp flags   : '-I/usr/local/Java/jre/../include
 -I/usr/local/Java/jre/../include/linux'
 java libs   : '-L/usr/local/Java/jre/lib/amd64
 -L/usr/local/Java/jre/lib/amd64/server -ljvm'
 checking whether JNI programs can be compiled... yes
 checking JNI data types... ok
 checking whether JRI should be compiled (autodetect)... yes
 checking whether debugging output should be enabled... no
 checking whether memory profiling is desired... no
 checking whether threads support is requested... no
 checking whether callbacks support is requested... no
 checking whether JNI cache support is requested... no
 checking whether JRI is requested... yes
 configure: creating ./config.status
 config.status: creating src/Makevars
 config.status: creating R/zzz.R
 config.status: creating src/config.h
 === configuring in jri (/tmp/RtmpUyYk4N/R.INSTALL6645c827c46/rJava/jri)
 configure: running /bin/sh ./configure '--prefix=/usr/local' 
 --cache-file=/dev/null --srcdir=.
 checking build system type... x86_64-unknown-linux-gnu
 checking host system type... x86_64-unknown-linux-gnu
 checking for gcc... gcc -std=gnu99
 checking for C compiler default output file name... a.out
 checking whether the C compiler works... yes
 checking whether we are cross compiling... no
 checking for suffix of executables... 
 checking for suffix of object files... o
 checking whether we are using the GNU C compiler... yes
 checking whether gcc -std=gnu99 accepts -g... yes
 checking for gcc -std=gnu99 option to accept ISO C89... none needed
 checking how to run the C preprocessor... gcc -std=gnu99 -E
 checking for grep that handles long lines and -e... /bin/grep
 checking for egrep... /bin/grep -E
 checking for ANSI C header files... yes
 checking whether Java interpreter works... checking whether JNI programs can
 be compiled... yes
 checking whether JNI programs can be run... yes
 checking JNI data types... ok
 checking whether Rinterface.h exports R_CStackXXX variables... yes
 checking whether Rinterface.h exports R_SignalHandlers... yes
 configure: creating ./config.status
 config.status: creating src/Makefile
 config.status: creating Makefile
 config.status: creating run
 config.status: creating src/config.h
 ** libs
 gcc -std=gnu99 -I/usr/local/lib64/R/include -DNDEBUG -I.
 -I/usr/local/Java/jre/../include -I/usr/local/Java/jre/../include/linux
 -I/usr/local/include-fpic  -g -O2  -c Rglue.c -o Rglue.o
 gcc -std=gnu99 -I/usr/local/lib64/R/include -DNDEBUG -I.
 -I/usr/local/Java/jre/../include

Re: [R] Java, rJava, and Windows x64

2013-01-09 Thread Simon Urbanek


On Oct 29, 2012, at 11:17 AM, Robert Baer wrote:

 When running [1] R version 2.15.1 (2012-06-22) x86_64-pc-mingw32, rJava 
 fails. I have installed both the 32-bit and 64-bit versions of Java 7 update 
 9.
 
  library(rJava)
 Error : .onLoad failed in loadNamespace() for 'rJava', details:
 call: stop("No CurrentVersion entry in '", key, "'! Try re-installing Java 
 and make sure R and Java have matching architectures.")
 error: object 'key' not found
 Error: package/namespace load failed for ‘rJava’
 
 
 It appears that rJava was not seeing the x64 Java. For clarity, I installed 
 the 32-bit java library second, and I imagined this might be the problem. The 
 Java installer told me that it was already present, and the x64 library 
 appeared to be working with the 64-bit IE9 browser
 
 Indeed, reinstalling Java x64, the rJava package loaded fine with the 
 library(rJava) command in 64-bit R. rJava could STILL be loaded with 
 library(rJava) within x86 R.
 
 My question is, should the order of Java installation affect the ability of 
 rJava to load under 64-bit R? Are there environmental variables or registry 
 settings that should be checked in such cases or is it literally necessary to 
 do a complete reinstall?
 

The registries are completely separate for 32-bit and 64-bit so installing 
32-bit Java doesn't affect 64-bit R and vice-versa. rJava is simply checking 
the registry that it has access to and it is the one corresponding to the R 
process (so 32-bit R will check 32-bit registry and 64-bit R will check the 
64-bit registry). It is looking for either of Software\JavaSoft\Java Runtime 
Environment or Software\JavaSoft\Java Development Kit registry tree.
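
To see what rJava would find, the same keys can be inspected from within R on Windows with utils::readRegistry(). A sketch under assumptions: java_registry_entry is a made-up helper, the hive "HLM" (HKEY_LOCAL_MACHINE) is assumed to be where the Java installer wrote its keys, and the function simply returns NULL on other platforms or when the key is missing.

```r
# Hypothetical helper: list the Java registry entries visible to this R process.
java_registry_entry <- function(key = "SOFTWARE\\JavaSoft\\Java Runtime Environment") {
  # readRegistry() exists only in Windows builds of R
  if (.Platform$OS.type != "windows") return(NULL)
  tryCatch(utils::readRegistry(key, hive = "HLM"),
           error = function(e) NULL)  # key absent for this architecture
}

java_registry_entry()
```

Running it once from 32-bit R and once from 64-bit R should show the two separate views described above.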

Cheers,
Simon

PS: Please use the stats-rosuda-devel mailing list for rJava questions.

[R] weighted factor analysis

2013-01-09 Thread Virgile Capo-Chichi

hello there,
I am trying to use a weight variable in a factor analysis but apparently
the factanal command does not have a weight option. Any way to do this? Thanks
for your suggestions, V
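
One possible route, offered as a sketch rather than a definitive answer: factanal() accepts a covmat argument, which may be a covariance list as returned by cov.wt(), so the weights can enter through a weighted covariance matrix. The data, the weights w and the two-factor structure below are simulated purely for illustration.

```r
set.seed(42)
n  <- 300
f1 <- rnorm(n)
f2 <- rnorm(n)

# Six simulated indicators: x1-x3 load on f1, x4-x6 load on f2
X <- cbind(x1 = f1 + rnorm(n, sd = 0.6),
           x2 = f1 + rnorm(n, sd = 0.6),
           x3 = f1 + rnorm(n, sd = 0.6),
           x4 = f2 + rnorm(n, sd = 0.6),
           x5 = f2 + rnorm(n, sd = 0.6),
           x6 = f2 + rnorm(n, sd = 0.6))

# Hypothetical observation weights, normalised to sum to one
w <- runif(n, 0.5, 2)

# Weighted covariance list (contains cov, center, n.obs, wt)
wcov <- cov.wt(X, wt = w / sum(w))

# Factor analysis on the weighted covariance matrix
fit <- factanal(covmat = wcov, factors = 2)
print(fit$loadings)
```

Note that n.obs in the covariance list is the raw number of rows, so the goodness-of-fit test reported by factanal() does not account for the weighting.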

Re: [R] select partial name and full name columns

2013-01-09 Thread Irucka Embry

Hi Arun, thank-you for your suggestion.

I made a mistake previously when I suggested that there was a prefix
in front of "00060_3", possibly suggesting that it was a string of
characters rather than numbers. The prefix in front of "00060_3"
is actually two numbers; see the examples below:

01_00060_3 01_00060_3_cd 15_00060_3 15_00060_3_cd
02_00060_3 02_00060_3_cd

How can the following code be modified to reflect the numerical rather
than character prefix? 

dat1[,c("datetime",colnames(dat1)[grep("00060_3",colnames(dat1))])]

Thank-you.

Irucka Embry


Re: [R] how to count A, C, T, G in each row in a big data.frame?

2013-01-09 Thread Jessica Streicher

Sorry, you wanted rows, i wrote for columns

#rows would be:
test2 <- apply(test[,-c(1:4)],1,function(x){table(t(x))})

#find single values in a row
sapply(test2,function(row){
  allVars <- paste(names(row),collapse="")
  u <- unique(strsplit(allVars,"")[[1]])
  parts <- sapply(names(row),function(x){u%in%strsplit(x,"")[[1]]})
  mat <- parts%*%row
  rownames(mat) <- u
  mat
})

though i guess lists aren't ideal, but theres another answer as well i see.
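
A more direct variant, as a sketch only: split every genotype string in a row into single letters and count each base. The tiny geno data frame and the helper name count_bases are made up; with the data from the question the genotype columns would be test[,-c(1:4)].

```r
# Hypothetical miniature of the genotype columns (two-letter calls)
geno <- data.frame(X1 = c("AA", "CT"),
                   X2 = c("AG", "TT"),
                   X3 = c("GA", "CT"),
                   stringsAsFactors = FALSE)

# Count A, C, G and T among all letters of one row
count_bases <- function(x) {
  bases <- unlist(strsplit(x, ""))
  vapply(c("A", "C", "G", "T"), function(b) sum(bases == b), integer(1))
}

# One row of counts per row of the input, one column per base
counts <- t(apply(geno, 1, count_bases))
counts
```

Unlike the table-based version, this always returns a matrix with the same four columns, even when a base does not occur in a row.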

On 09.01.2013, at 15:23, Yao He wrote:

 Dear All
 
 I have a data.frame like that:
 structure(list(name = c(Gga_rs10722041, Gga_rs10722249, Gga_rs10722565,
 Gga_rs10723082, Gga_rs10723993, Gga_rs10724555, Gga_rs10726238,
 Gga_rs10726461, Gga_rs10726774, Gga_rs10726967, Gga_rs10727581,
 Gga_rs10728004, Gga_rs10728156, Gga_rs10728177, Gga_rs10728373,
 Gga_rs10728585, Gga_rs10729598, Gga_rs10729643, Gga_rs10729685,
 Gga_rs10729827), chr = c(7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L,
 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L), pos = c(11248993L,
 20038370L, 16164457L, 38050527L, 20307106L, 13707090L, 12230458L,
 36732967L, 2790856L, 1305785L, 29631963L, 13606593L, 13656397L,
 2261611L, 32096703L, 13733153L, 16524147L, 558735L, 12514023L,
 3619538L), strand = c(+, +, +, +, +, +, +, +,
 +, +, +, +, +, +, +, +, +, +, +, +),
X2353 = c(AA, TT, TT, CC, TT, CC, CC, TT,
CC, GG, AG, AG, AG, TT, CC, AG, CC, AA,
GG, GG), X2409 = c(AA, CT, TT, CC, CT, CC,
CC, TT, CC, GG, GG, AG, AG, TT, CC, AG,
CC, AA, AG, GA), X2500 = c(GA, TT, TT, CC,
TT, CC, CC, TT, CC, GG, GG, GG, GG, GT,
CT, GG, CC, AA, AA, AA), X2598 = c(AA, TT,
TT, CC, TT, CC, CC, TT, CC, GG, AA, AG,
GG, TT, CC, AG, TC, AA, AA, AG), X2610 = c(AA,
TT, TT, CC, TT, CC, CC, TT, CC, GG, GA,
GA, GG, TT, CC, GA, CC, AA, AA, GA), X2300 = c(GA,
TT, TT, CC, TT, CC, CC, TT, CC, GG, GA,
AA, AG, TT, TC, AA, TC, AA, AG, AA), X2507 = c(AG,
TT, TT, CC, TT, CC, CC, TT, CC, GG, GG,
GA, GG, TT, TC, GG, CC, AA, GA, AG), X2530 = c(AG,
TC, TT, CC, TC, CC, CC, TT, CC, GG, AA,
GG, GG, TT, CC, GG, CC, AA, AA, AA), X2327 = c(AA,
TT, TT, CC, TT, CC, CC, TT, CC, GG, GA,
GG, GG, TT, TC, GG, CC, AA, AA, AA), X2389 = c(AA,
CC, TT, CC, CC, CC, CC, TT, CC, AG, GG,
AG, GG, TT, TC, AG, CC, AA, AA, AA), X2408 = c(AA,
TT, TT, CC, TT, CC, CC, TT, CC, GG, GA,
GA, GG, TT, CC, GA, CC, AA, AA, AG), X2463 = c(AA,
TT, TT, CC, TT, CC, CC, TT, CC, GG, GG,
GG, GG, TT, CT, GG, CC, AA, AA, GA), X2420 = c(GA,
TC, TT, CC, TC, CC, CC, TT, CC, GG, AG,
GG, GG, TG, TT, GG, CT, AA, AA, AA), X2563 = c(GA,
CC, TT, CC, TC, CC, CC, TT, CC, GG, GA,
GG, GG, GT, TT, GG, CT, AA, AA, AA), X2462 = c(AA,
TT, TT, CC, TT, CC, CC, TT, CC, GG, AA,
GG, GG, GT, TC, GG, CC, AA, AA, AA), X2292 = c(GA,
TT, TT, CC, TT, CC, CC, TT, CC, GG, GA,
AA, GG, TG, TC, AA, TC, AA, AA, AA), X2405 = c(GA,
TT, TT, CC, TT, CC, CC, TT, CC, GG, GG,
AG, GG, TG, TT, AA, CT, AA, AA, AA), X2543 = c(AA,
TC, TT, CC, TC, CC, CC, TT, CC, GA, GA,
GA, GG, TT, CT, GA, TT, AA, AA, GG), X2557 = c(AG,
CT, TT, CC, CT, CC, CC, TT, CC, GG, AG,
GA, GG, GT, CT, GA, CT, AA, AA, AG), X2583 = c(GA,
CT, TT, CC, CT, CC, CC, TT, CC, GG, GA,
GG, GG, GG, CT, GA, CT, AA, AA, AG), X2322 = c(AG,
TT, TT, CC, TT, CC, CC, TT, CC, GG, GG,
GG, GG, GT, TT, GG, CC, AA, AA, GA), X2535 = c(AA,
TC, TT, CC, TT, CC, CC, TT, CC, GG, GA,
GG, GG, TT, CC, GG, CC, AA, AA, AG), X2536 = c(GA,
TC, TT, CC, TC, CC, CC, TT, CC, GG, GG,
AG, GG, TT, TC, AG, TC, AA, AA, GA), X2581 = c(AG,
CT, TT, CC, CT, CC, CC, TT, CC, GG, GG,
GA, GG, TT, CC, GA, CT, AA, AA, AG), X2570 = c(AA,
CT, TT, CC, CT, CC, CC, TT, CC, GG, GG,
GG, GG, TT, TC, GG, CC, AA, AA, GG), X2476 = c(AA,
TT, TT, CC, TT, CC, CC, TT, CC, GG, GG,
GG, GG, GT, TC, AG, CC, AA, AA, AG), X2534 = c(GA,
TC, TT, CC, TC, CC, CC, TT, CC, GG, GA,
AG, GG, TG, CC, AG, TC, AA, AA, AA), X2280 = c(AA,
TC, TT, CC, TC, CC, CC, TT, CC, GG, AG,
AG, GG, TT, CC, GG, CC, AA, AA, AG), X2316 = c(AA,
CC, TT, CC, CC, CC, CC, TT, CC, AG, AA,
AA, AG, TT, TC, GG, CT, AA, GG, GG), X2339 = c(AA,
CC, TT, CC, CC, CC, CC, TT, CC, GA, AA,
GG, GG, GT, CT, GG, TT, AA, AA, AG), X2331 = c(AA,
TC, TT, CC, TC, CC, CC, TT, CC, GG, GG,
GG, GG, TT, CC, GG, CC, AA, AA, AG), X2343 = c(AA,
TC, TT, CC, TC, CC, CC, TT, CC, GG, GG,
GG, GG, TT, CT, GG, CC, AA, AA, GA), X2352 = c(AA,
TT, TT, CC, TT, CC, CC, TT, CC, GG, AA,
GG, GG, TT, CC, GG, CC, AA, GA, AG), X2293 = c(GA,
TT, TT, CC, TT, CC, CC, TT, CC, GG, GA,
AA, GG, TT, TC, AA, CT, AA, AA, AA), X2338 = c(GA,
TT, TT, CC, TT, CC, CC, TT, CC, GG, GG,
AG, GG, TT, TC, AG, TC, AA, AA, GA), X2449 = c(AA,
TT, TT, CC, TT, CC, CC, TT, CC, GG, AG,
AA, GG, TT, CC, AA, TC, AA, AA, GA), X2296 = c(GA,
TT, TT, CC, TT, CC, CC, TT, CC, GA, GG,
AG, GG, TG, TC, AG, CC, AA, AA, AA), X2453 = c(AG,
TT, TT, CC, TT,

Re: [R] Integrating Java, C++ and R

2013-01-09 Thread Simon Urbanek


On Jan 4, 2013, at 11:41 AM, Dirk Eddelbuettel wrote:

 
 On 4 January 2013 at 16:57, Suzen, Mehmet wrote:
 | On 4 January 2013 11:36, Royden Fernandes roydens...@gmail.com wrote:
 |  Hi,
 | 
 |  I am able to integrate C++ and R through RInside library. However when I
 | 
 | Questions regarding RInside should go to the rcpp-devel mailing list.
 | http://lists.r-forge.r-project.org/mailman/listinfo/rcpp-devel
 
 Very good.
 
 But to the OP's defence -- he posted there. But as he himself stated (in what
 you still quote here): The RInside integration of R and C++ works for him,
 but Java created trouble.  So I recommended r-devel (not r-help) to seek help
 from someone with better Java understanding.
 

Agreed. It will require combined knowledge of Rcpp and Java, though. What is 
testR()? Note that R requires a set of environment variables to be set up 
correctly in order to run - so did you start your program using R CMD java ...? 
Also you will likely need to make sure that you disable stack limit checks 
since java may change the stack depending on the thread. Another alternative 
would be to use JRI as a starting point since it solves all the R/Java issues 
and then call C++ code from there.
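
As a small aid for the environment check, here is a sketch that lists a few of the variables in question from a normally started R session, so they can be compared with what the Java process sees. The variable names are an assumption based on what R's own startup script exports on Unix-like systems; the list is not meant to be complete.

```r
# Variables assumed to be exported by the R startup script on Unix-like systems
r_env <- Sys.getenv(c("R_HOME", "R_SHARE_DIR", "R_INCLUDE_DIR", "R_DOC_DIR"),
                    unset = NA)
print(r_env)

# Names of the variables that are not set in this session
missing_vars <- names(r_env)[is.na(r_env)]
missing_vars
```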

Cheers,
Simon


 Dirk
 
 -- 
 Dirk Eddelbuettel | e...@debian.org | http://dirk.eddelbuettel.com
 

Re: [R] select partial name and full name columns

2013-01-09 Thread Irucka Embry

Hi Arun, thanks again for your assistance.

Previously I did not read the files with the headers so I could not
search for those prefixed names. I corrected my mistake and the code
that you suggested does work.
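
For the archive, a minimal sketch of the two pieces together: reading the header row, then selecting by name. The tab-separated text below is made up to mimic the column layout in this thread, and the regular expression is an assumption about which columns are wanted. check.names = FALSE keeps names that start with a digit unchanged, and value = TRUE makes grep() return names instead of positions, so they can be combined with "datetime" in one character vector.

```r
# Hypothetical tab-separated input with the column layout from this thread
txt <- paste("datetime\t01_00060_3\t01_00060_3_cd\t01_00065_3",
             "6/3/2011\t7\tA\t1.2",
             "6/4/2011\t2\tA\t1.3",
             sep = "\n")

# header = TRUE reads the column names; check.names = FALSE keeps them as-is
dv <- read.table(text = txt, sep = "\t", header = TRUE,
                 check.names = FALSE, stringsAsFactors = FALSE)

# Columns ending in 00060_3 or 00060_3_cd, whatever the prefix
keep <- c("datetime", grep("_00060_3(_cd)?$", colnames(dv), value = TRUE))
sel  <- dv[, keep]
sel
```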

Irucka

-Original Message- 
From: arun [smartpink...@yahoo.com]
Sent: 1/9/2013 11:09:13 AM
To: iruc...@mail2world.com
Cc: r-help@r-project.org
Subject: Re: [R] select partial name and full name columns



Hi,
You can use the same code:
set.seed(15)
dat1 <- data.frame(sample(1:10,5,replace=TRUE),sample(20:30,5,replace=TRUE),
                   sample(1:15,5,replace=TRUE),sample(1:8,5,replace=TRUE),
                   datetime=as.POSIXct(paste(rep("6/3/2011",5),
                                             c("0:00","0:30","0:35","0:40","0:45")),
                                       format="%m/%d/%Y %H:%M"))

colnames(dat1)[1:4] <- c("01_00060_3","01_00060_3_cd","15_00060_3","15_00060")

dat1
#  01_00060_3 01_00060_3_cd 15_00060_3 15_00060            datetime
#1          7            30          2        7 2011-06-03 00:00:00
#2          2            28         10        4 2011-06-03 00:30:00
#3         10            22          8        8 2011-06-03 00:35:00
#4          7            27         11        2 2011-06-03 00:40:00
#5          4            29         13        7 2011-06-03 00:45:00

dat1[,c("datetime",colnames(dat1)[grep("00060_3",colnames(dat1))])]
#             datetime 01_00060_3 01_00060_3_cd 15_00060_3
#1 2011-06-03 00:00:00          7            30          2
#2 2011-06-03 00:30:00          2            28         10
#3 2011-06-03 00:35:00         10            22          8
#4 2011-06-03 00:40:00          7            27         11
#5 2011-06-03 00:45:00          4            29         13


A.K.

[R] R encrypt/decrypt

2013-01-09 Thread Ramiro Barrantes

Hello,

I am working on a web system (php) that uses R in the backend, and we need some 
basic fast encryption/decryption for the underlying mysql database that can be 
used by both R AND php.  It does not need to be top-of-the-line, but just 
provide some basic level of fast encryption/decryption.

Any suggestions?

Thank you,

Ramiro

Re: [R] select partial name and full name columns

2013-01-09 Thread arun



Hi,

Maybe this is creating the problem:

set.seed(15)
dat1 <- data.frame(A_00060_3=sample(1:10,5,replace=TRUE),
                   B_00060_3_cd=sample(20:30,5,replace=TRUE),
                   C_00060_3=sample(1:15,5,replace=TRUE),
                   D_00060=sample(1:8,5,replace=TRUE),
                   datetime=as.POSIXct(paste(rep("6/3/2011",5),
                                             c("0:00","0:30","0:35","0:40","0:45")),
                                       format="%m/%d/%Y %H:%M"))
dat1[,c("datetime",grep("00060_3",colnames(dat1)))]
#Error in `[.data.frame`(dat1, , c("datetime", grep("00060_3",
#colnames(dat1)))) : 
#  undefined columns selected
dat1[,c("datetime",colnames(dat1)[grep("00060_3",colnames(dat1))])]
#             datetime A_00060_3 B_00060_3_cd C_00060_3
#1 2011-06-03 00:00:00         7           30         2
#2 2011-06-03 00:30:00         2           28        10
#3 2011-06-03 00:35:00        10           22         8
#4 2011-06-03 00:40:00         7           27        11
#5 2011-06-03 00:45:00         4           29        13
A.K.



- Original Message -
From: Irucka Embry iruc...@mail2world.com
To: r-help@r-project.org
Cc: 
Sent: Wednesday, January 9, 2013 5:44 AM
Subject: [R] select partial name and full name columns

Hi, I have the following function:

getDataFromDVFileCustom <- function (file, hasHeader = TRUE, separator = "\t") 
{
  DVdatatmp <- as.matrix(read.table(file, sep = "\t", fill = TRUE,
    comment.char = "#", as.is = TRUE, stringsAsFactors = FALSE,
    na.strings = "NA"))
  DVdatatmper <- as.matrix(DVdatatmp[ , c("datetime",
    grep("^_00060_3", colnames(DVdatatmp)))])
  retval <- as.data.frame(DVdatatmper, colClasses = c("character"),
    fill = TRUE, comment.char = "#", stringsAsFactors = FALSE)
  if (ncol(retval) == 2) {
    names(retval) <- c("dateTime", "value")
  }
  else if (ncol(retval) == 3) {
    names(retval) <- c("dateTime", "value", "code")
  }
  if (dateFormatCheck(retval$dateTime)) {
    retval$dateTime <- as.Date(retval$dateTime)
  }
  else {
    retval$dateTime <- as.Date(retval$dateTime, format = "%m/%d/%Y")
  }
  retval$value <- as.numeric(retval$value)
  return(retval)
}

The function gives me this error:
getDataFromDVFileCustom(file)
Error in as.matrix(DVdatatmp[, c("datetime", grep("^_00060_3",
colnames(DVdatatmp)))]) : 
subscript out of bounds

I am trying to select only 3 columns: "datetime" and then two partial-name
columns that end in "00060_3" and "00060_3_cd". Each file that I
will be reading into the function has a different number of columns and
a different prefix in front of "00060_3" and "00060_3_cd". I have
searched online and tried those possible solutions, but they did not
work for my function and data.

What is the best way to select those 3 columns only?

Thank-you.

Irucka Embry 


[R] problem adding curve/abline

2013-01-09 Thread Elisabeth Van Beveren

Hey,

I'm stuck on something I already did before (just a different kind of
database), and whatever I try, it doesn't work anymore. So thanks for your
help.

Here's approximately what my data looks like:

year   season  replicate  size  freq  weight
2000   summer  ch1        6     1     45
2000   summer  ch1        6.5   12    46
2000   summer  ch1        7     33    470

I have 2 years (2000 and 2001) and 2 seasons (winter and summer). I wanted
to plot weight~size, with 2 groups (year and season), so here's my shortened
script for that:

database$groups=paste(database$seizon,database2$year,sep=" ")

xyplot(database$weight~database2$size,
groups=database$groups,
par.settings=list(superpose.symbol=list(col=col.list,pch=c(21,16,21,16))),
auto.key=list(corner=c(0.1,0.9),lines=F,points=T))

This works fine; the problem comes when I try to add 2 exponential curves
to the data (one for each season). I tried this:

summ=subset(database,seizon=="summer")

modsumm=nls(summ$weight~exp(a+b*summ$size), data=summ, start=list(a=0,b=0))

exposumm=curve(exp(0.05354+0.19872*x), from=0, to=22, add=T, lwd=1,
col="blue",lty=1)

After having to add plot.new() in front, the line either does not show up
or shows up in the wrong place. I thought this might be because of the
subset, so I wanted to do something like this:

modsumm=nls(weight~exp(a+b*size), data=engsAGG2[seizon=="summer"],
start=list(a=0,b=0))

which returns: undefined columns selected

Thanks in advance for the reply.
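
A guess at what is going on, offered as a sketch rather than a tested fix for the original data: xyplot() is a lattice (grid) plot, whereas curve(..., add=T) draws on a base graphics device, so the two do not share a coordinate system. Separately, df[cond] without a comma selects columns, which would explain "undefined columns selected". The data below are simulated, the column name season follows the table header in the post, and taking the starting values from a log-linear lm() fit is a substitution for start=list(a=0,b=0).

```r
library(lattice)  # xyplot() and the panel.* functions

# Simulated stand-in for the posted data
set.seed(1)
database <- data.frame(season = rep(c("summer", "winter"), each = 20),
                       size   = rep(seq(3, 22, length.out = 20), 2))
database$weight <- exp(0.05 + 0.2 * database$size) * exp(rnorm(40, sd = 0.05))

# Row subsetting needs the comma: df[rows, ]
summ <- database[database$season == "summer", ]

# Starting values from a log-linear fit, then the exponential model via nls()
st <- coef(lm(log(weight) ~ size, data = summ))
modsumm <- nls(weight ~ exp(a + b * size), data = summ,
               start = list(a = st[[1]], b = st[[2]]))

# Fitted curve evaluated on a grid of sizes
grid_x <- seq(0, 22, length.out = 100)
fit_y  <- predict(modsumm, newdata = data.frame(size = grid_x))

# Draw the points and the fitted curve inside the lattice panel
p <- xyplot(weight ~ size, data = database, groups = season,
            panel = function(x, y, ...) {
              panel.xyplot(x, y, ...)
              panel.lines(grid_x, fit_y, col = "blue")
            })
# print(p) draws the plot
```

Using bare column names in the nls() formula (weight, size instead of summ$weight, summ$size) is what allows predict() to evaluate the model on new sizes.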

 


Re: [R] Basic loop programming

2013-01-09 Thread arun

HI,

If you have more than one observation per month, you could do this:
dat1 <- read.table(text="
Year Month Sales Customer
2011 Jan 150 35
2011 Jan 125 40
2011 Feb 130 45
2011 Feb 135 25
2012 Jan 100 25
2012 Jan 150 35
2012 Feb 118 45
2012 Feb 120 30
2012 Mar 130 43
2012 Mar 125 35
",sep="",header=TRUE,stringsAsFactors=FALSE)
res <- aggregate(.~Year+Month,data=dat1,mean)
within(res,{Avrev <- Sales/Customer})
#  Year Month Sales Customer    Avrev
#1 2011   Feb 132.5     35.0 3.785714
#2 2012   Feb 119.0     37.5 3.173333
#3 2011   Jan 137.5     37.5 3.666667
#4 2012   Jan 125.0     30.0 4.166667
#5 2012   Mar 127.5     39.0 3.269231


A.K.



- Original Message -
From: Paolo Donatelli donatellipa...@gmail.com
To: r-help@r-project.org
Cc: 
Sent: Wednesday, January 9, 2013 11:02 AM
Subject: [R] Basic loop programming

Hi all,

newbie question: I am trying to set up a very simple loop without succeeding.

Let's say I have monthly observation of two variables for a year
- Sales_2012_01, Sales_2012_02, Sales_2012_03,       (total sales
for jan 2012,feb 2012, etc.)
- Customers_2012_01, Customers_2012_02,   (total number of
customers for jan 2012, etc.)

and I want to create new monthly variables in order to compute
revenues per customers:

Av_revenue_2012_01 = Sales_2012_01 / Customers_2012_01
Av_revenue_2012_02 = Sales_2012_02 / Customers_2012_02
...

how can I proceed?


In other programming language I used just to write something like
for (i in list(01,02, ..., 12) {
Av_revenue_2012_'i' = Sales_2012_'i' / Customers_2012_'i'
}

but in R it seems not to work like that. Further, and correct me if I
am wrong, I cannot use simple (i in 1:12) since I have a 0 digit in
front of the single-digit months.

thanks in advance for your help

Re: [R] Applying a user-defined function

2013-01-09 Thread arun

Hi Pradip,

Another way to get the results would be:
res <- cbind(test1,do.call(data.frame,lapply(test1[,seq(1,6,2)],CutQuintiles)))
colnames(res)[7:9] <- paste("newcols_",colnames(res)[7:9],"")
sapply(res,is.factor)
#               ObtMj_P               ObtMj_SE           ExpPrevMed_P
#                 FALSE                  FALSE                  FALSE
#         ExpPrevMed_SE               ParMon_P              ParMon_SE
#                 FALSE                  FALSE                  FALSE
#     newcols_ ObtMj_P  newcols_ ExpPrevMed_P      newcols_ ParMon_P
#                  TRUE                   TRUE                   TRUE
Hope it helps.
A.K.




- Original Message -
From: Muhuri, Pradip (SAMHSA/CBHSQ) pradip.muh...@samhsa.hhs.gov
To: R help r-help@r-project.org
Cc: 
Sent: Tuesday, January 8, 2013 10:06 PM
Subject: Re: [R] Applying a user-defined function


Hello List,

Last time, Arun's following solution worked to create 3 new columns (1,3,5).  
Now how would I tweak this function to create corresponding (additional) 
columns (7,8,9) of mode factor (levels = 1,2,3,4,5)?

Thanks for your continued support.

Pradip

### cut and paste from the reproducible example
CutQuintiles <- function( x) {
  cut (x,quantile (x, (0:5/5)),include.lowest=TRUE)
}

#apply the CutQuintile () on every odd-numbered columns of the test1 data frame
test1$newcols <- sapply(test1 [, seq (1,6,2)], CutQuintiles)

# name 3 new columns based on the odd-numbered columns
names(test1$newcols) <- paste (names(test1 [, seq (1,6,2)]), "_cat")




## Reproducible Example


test1 <- read.table (text="
State,ObtMj_P,ObtMj_SE,ExpPrevMed_P,ExpPrevMed_SE,ParMon_P,ParMon_SE
Alabama,49.60,1.37,80.00,0.91,12.10,0.68
Alaska,55.00,1.41,81.80,1.08,12.40,0.90
Arizona,52.50,1.56,79.60,1.20,15.80,1.08
Arkansas,50.50,1.22,78.00,0.78,12.80,0.72
California,51.10,0.65,80.50,0.53,13.00,0.41
Colorado,55.10,1.26,81.70,1.03,12.10,0.72
Connecticut,56.30,1.28,85.00,0.93,14.60,0.77
Delaware,53.60,1.30,79.50,1.04,14.70,0.97
District of Columbia,53.50,1.22,76.20,1.03,14.30,1.13
Florida,52.70,0.67,78.90,0.52,14.10,0.45
Georgia,52.50,1.15,79.30,1.02,15.90,0.98
Hawaii,49.40,1.33,83.80,1.12,16.00,1.06
Idaho,48.30,1.23,82.40,0.99,11.90,0.74
Illinois,52.70,0.63,81.00,0.46,13.60,0.40
Indiana,49.60,1.16,80.90,0.91,12.60,0.82
Iowa,46.30,1.37,82.10,1.01,13.60,0.87
Kansas,44.30,1.43,79.20,0.98,12.90,0.79
Kentucky,52.90,1.37,78.70,1.05,14.60,0.98
Louisiana,49.70,1.23,76.80,1.06,14.50,0.76
Maine,55.60,1.44,82.90,0.93,16.70,0.83
Maryland,53.90,1.46,83.60,0.95,14.00,0.80
Massachusetts,55.40,1.41,81.00,1.15,14.70,0.80
Michigan,52.40,0.62,80.50,0.47,15.00,0.43
Minnesota,51.50,1.20,84.40,0.87,14.40,0.86
Mississippi,43.20,1.14,76.60,0.91,12.30,0.78
Missouri,48.70,1.20,80.30,0.90,13.70,0.12
Montana,56.40,1.16,83.70,0.95,12.10,0.68
Nebraska,45.70,1.51,83.40,0.95,12.40,0.90
Nevada,54.20,1.17,80.60,1.07,15.80,1.08
New Hampshire,56.10,1.30,83.30,0.93,12.80,0.72
New Jersey,53.20,1.45,83.70,0.95,13.00,0.41
New Mexico,57.60,1.34,78.90,1.03,12.10,0.72
New York,53.70,0.67,82.60,0.48,14.60,0.77
North Carolina,52.20,1.26,81.90,0.84,14.70,0.97
North Dakota,48.60,1.34,84.20,0.88,14.30,1.13
Ohio,50.90,0.61,82.70,0.49,14.10,0.45
Oklahoma,47.20,1.42,78.80,1.33,15.90,0.98
Oregon,54.00,1.35,80.60,1.14,16.00,1.06
Pennsylvania,53.00,0.63,79.90,0.47,11.90,0.74
Rhode Island,57.20,1.20,79.50,1.02,13.60,0.40
South Carolina,50.50,1.21,79.50,0.95,12.60,0.82
South Dakota,43.40,1.30,81.70,1.05,13.60,0.87
Tennessee,48.90,1.35,78.40,1.35,12.90,0.79
Texas,48.70,0.62,79.00,0.48,14.60,0.98
Utah,42.00,1.49,85.00,0.93,14.50,0.76
Vermont,58.70,1.24,83.70,0.84,16.70,0.83
Virginia,51.80,1.18,82.00,1.04,14.00,0.80
Washington,53.50,1.39,84.10,0.96,14.70,0.80
West Virginia,52.80,1.07,79.80,0.93,15.00,0.43
Wisconsin,49.90,1.50,83.50,1.02,14.40,0.86
Wyoming,49.20,1.29,82.00,0.85,12.30,0.78
", sep=",", row.names='State',  header=TRUE, as.is=TRUE)

# change names () to lower case

names (test1) <- tolower (names (test1))

#Write a cut/quantile function to apply on different columns of the data frame

CutQuintiles <- function( x) {
  cut (x,quantile (x, (0:5/5)),include.lowest=TRUE)
}

#apply the CutQuintile () on every odd-numbered columns of the test1 data frame
test1$newcols <- sapply(test1 [, seq (1,6,2)], CutQuintiles)

# name 3 new columns based on the odd-numbered columns
names(test1$newcols) <- paste (names(test1 [, seq (1,6,2)]), "_cat")

dim (test1)
options (width=100)
test1





Re: [R] select partial name and full name columns

2013-01-09 Thread arun



Hi,
You can use the same code:
set.seed(15)

dat1 <- data.frame(sample(1:10,5,replace=TRUE), sample(20:30,5,replace=TRUE),
                   sample(1:15,5,replace=TRUE), sample(1:8,5,replace=TRUE),
                   datetime=as.POSIXct(paste(rep("6/3/2011",5),
                                             c("0:00","0:30","0:35","0:40","0:45")),
                                       format="%m/%d/%Y %H:%M"))

colnames(dat1)[1:4] <- c("01_00060_3","01_60_3_cd","15_60_3","15_00060")


dat1
#  01_00060_3 01_60_3_cd 15_60_3 15_00060
#1  7 30   2    7
#2  2 28  10    4
#3 10 22   8    8
#4  7 27  11    2
#5  4 29  13    7
  #   datetime
#1 2011-06-03 00:00:00
#2 2011-06-03 00:30:00
#3 2011-06-03 00:35:00
#4 2011-06-03 00:40:00
#5 2011-06-03 00:45:00


dat1[,c("datetime",colnames(dat1)[grep("00060_3",colnames(dat1))])]
# datetime 01_00060_3 01_60_3_cd 15_60_3
#1 2011-06-03 00:00:00  7 30   2
#2 2011-06-03 00:30:00  2 28  10
#3 2011-06-03 00:35:00 10 22   8
#4 2011-06-03 00:40:00  7 27  11
#5 2011-06-03 00:45:00  4 29  13


A.K.
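
A minimal sketch of selecting `datetime` plus every column whose name is a two-digit prefix followed by `_00060_3`, with or without the `_cd` suffix. The data frame `d` and its values are made up for illustration; the column names follow the pattern given in the question quoted below.

```r
# Hypothetical data with the naming scheme from the question
d <- data.frame(1:3, 4:6, 7:9, 10:12, 13:15)
names(d) <- c("datetime", "01_00060_3", "01_00060_3_cd", "15_00065_3", "15_00060")

# ^[0-9]{2}_ : exactly two digits and an underscore at the start of the name
# (_cd)?$    : an optional "_cd" suffix, then the end of the name
keep <- grepl("^[0-9]{2}_00060_3(_cd)?$", names(d))

sel <- d[, c("datetime", names(d)[keep])]
names(sel)
```

Anchoring the pattern at both ends keeps near-misses such as `15_00065_3` or `15_00060` out of the selection.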

From: Irucka Embry iruc...@mail2world.com
To: smartpink...@yahoo.com 
Cc: r-help@r-project.org 
Sent: Wednesday, January 9, 2013 11:36 AM
Subject: Re: [R] select partial name and full name columns


Hi Arun, thank-you for your suggestion.

I made a mistake previously when I suggested that there was a prefix in front 
of 00060_3 possibly suggesting that it was a string of characters rather 
than numbers. The prefix in front of 00060_3 is actually two numbers, 
see the examples below:

01_00060_3 01_00060_3_cd 15_00060_3 15_00060_3_cd 
02_00060_3 02_00060_3_cd

How can the following code be modified to reflect the numerical rather than 
character prefix? 

dat1[,c("datetime",colnames(dat1)[grep("00060_3",colnames(dat1))])]

Thank-you.

Irucka Embry


-Original Message- 
From: arun [smartpink...@yahoo.com]
Sent: 1/9/2013 7:13:05 AM
To: iruc...@mail2world.com
Cc: r-help@r-project.org
Subject: Re: [R] select partial name and full name columns



Hi,

Maybe this is what is creating the problem:

set.seed(15)
dat1 <- data.frame(A_00060_3=sample(1:10,5,replace=TRUE),
                   B_00060_3_cd=sample(20:30,5,replace=TRUE),
                   C_00060_3=sample(1:15,5,replace=TRUE),
                   D_00060=sample(1:8,5,replace=TRUE),
                   datetime=as.POSIXct(paste(rep("6/3/2011",5),
                                             c("0:00","0:30","0:35","0:40","0:45")),
                                       format="%m/%d/%Y %H:%M"))
dat1[,c("datetime",grep("00060_3",colnames(dat1)))]
#Error in `[.data.frame`(dat1, , c("datetime", grep("00060_3", colnames(dat1)))) :
#  undefined columns selected
dat1[,c("datetime",colnames(dat1)[grep("00060_3",colnames(dat1))])]
#             datetime A_00060_3 B_00060_3_cd C_00060_3
#1 2011-06-03 00:00:00         7           30         2
#2 2011-06-03 00:30:00         2           28        10
#3 2011-06-03 00:35:00        10           22         8
#4 2011-06-03 00:40:00         7           27        11
#5 2011-06-03 00:45:00         4           29        13
A.K.



- Original Message -
From: Irucka Embry iruc...@mail2world.com
To: r-help@r-project.org
Cc: 
Sent: Wednesday, January 9, 2013 5:44 AM
Subject: [R] select partial name and full name columns

Hi, I have the following function:

getDataFromDVFileCustom <- function (file, hasHeader = TRUE, separator = "\t")
{
    DVdatatmp <- as.matrix(read.table(file, sep = "\t", fill = TRUE,
        comment.char = "#", as.is = TRUE, stringsAsFactors = FALSE,
        na.strings = "NA"))
    DVdatatmper <- as.matrix(DVdatatmp[ , c("datetime",
        grep("^_00060_3", colnames(DVdatatmp)))])
    retval <- as.data.frame(DVdatatmper, colClasses = c("character"), fill = TRUE,
        comment.char = "#", stringsAsFactors = FALSE)
    if (ncol(retval) == 2) {
        names(retval) <- c("dateTime", "value")
    }
    else if (ncol(retval) == 3) {
        names(retval) <- c("dateTime", "value", "code")
    }
    if (dateFormatCheck(retval$dateTime)) {
        retval$dateTime <- as.Date(retval$dateTime)
    }
    else {
        retval$dateTime <- as.Date(retval$dateTime, format = "%m/%d/%Y")
    }
    retval$value <- as.numeric(retval$value)
    return(retval)
}

The function gives me this error:
getDataFromDVFileCustom(file)
Error in as.matrix(DVdatatmp[, c("datetime", grep("^_00060_3",
colnames(DVdatatmp)))]) : 
subscript out of bounds

I am trying to select only 3 columns (datetime and then two partial name
columns that end in 00060_3 and 00060_3_cd). Each file that I
will be reading into the function has a different number of columns and
a different prefix in front of 00060_3 and 00060_3_cd. I have
searched online and tried those possible solutions, but they did not
work for my function and data.

What is the best way

Re: [R] R encrypt/decrypt

2013-01-09 Thread Alemu Tadesse

Dear All,

I am wondering if there is a script in R or Python that can convert shape
files to KML or KMZ files. I used a free online shp2kml.exe file, but my
locations all went to Africa, although I know they are in the USA.

Thanks,

Alemu


Re: [R] [solved] t-test behavior given that the null hypothesis is true

2013-01-09 Thread Pavlos Pavlidis

Hi Ted,
yes this was the problem. Thank you very much.

best
idaios


On Wed, Jan 9, 2013 at 4:51 PM, Ted Harding ted.hard...@wlandres.net wrote:

 Ah! You have assigned a parameter equal.var=TRUE, and equal.var
 is not a listed parameter for t.test() -- see ?t.test :

   t.test(x, y = NULL,
          alternative = c("two.sided", "less", "greater"),
          mu = 0, paired = FALSE, var.equal = FALSE,
          conf.level = 0.95, ...)

 Try it instead with var.equal=TRUE, i.e. in your code:
   for(i in 1:k){
     rv.t.pvalues[i] <- t.test(rv[i, 1:(c/2)], rv[i, (c/2+1):c],
       ##equal.var=TRUE, alternative="two.sided")$p.value
       var.equal=TRUE, alternative="two.sided")$p.value
   }

 When I run your code with equal.var, I indeed repeatedly see
 the deficient bin for the lowest P-values that you observed.
 When I run your code with var.equal I do not see it.

 The explanation is that, since equal.var is not a recognised
 parameter for t.test(), it has assumed the default value FALSE
 for var.equal, and has therefore (since it is a 2-sample test)
 adopted the Welch/Satterthwaite procedure:

   var.equal: a logical variable indicating whether to treat
 the two variances as being equal. If 'TRUE' then the
 pooled variance is used to estimate the variance
 otherwise the Welch (or Satterthwaite) approximation
 to the degrees of freedom is used.

 This has the effect of somewhat adapting the test procedure to
 the data, so that extreme (i.e. small) values of P are even
 rarer than they should be.

 With best wishes,
 Ted.
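
The effect described above can be checked with a short simulation. This is a sketch under assumptions: the number of replicates `n_sim`, the seed, and the helper name `sim_once` are arbitrary choices, not taken from the thread. With equal group sizes the pooled and Welch tests share the same t statistic and differ only in their degrees of freedom, so the Welch p-value is never smaller than the pooled one.

```r
set.seed(123)
n_sim <- 2000   # number of simulated data sets (arbitrary)

# One replicate: two samples of 3 from N(0, 1), tested both ways
sim_once <- function() {
  x <- rnorm(3); y <- rnorm(3)
  c(pooled = t.test(x, y, var.equal = TRUE)$p.value,
    welch  = t.test(x, y)$p.value)          # default: var.equal = FALSE
}
p <- t(replicate(n_sim, sim_once()))

# Share of p-values below 0.05 under each method
colMeans(p < 0.05)

# Histograms side by side for comparing the lowest bins
op <- par(mfrow = c(1, 2))
hist(p[, "pooled"], main = "var.equal = TRUE",  xlab = "p-value")
hist(p[, "welch"],  main = "var.equal = FALSE", xlab = "p-value")
par(op)
```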

 On 09-Jan-2013 13:24:59 Pavlos Pavlidis wrote:
  Hi Ted,
  thanks for the reply. I use a similar code which you can see below:
 
  k <- 1
  c <- 6
  rv <- array(NA, dim=c(k, c) )
  for(i in 1:k){
    rv[i,] <- rnorm(c, mean=0, sd=1)
  }
 
  rv.t.pvalues <- array(NA, k)
 
  for(i in 1:k){
    rv.t.pvalues[i] <- t.test(rv[i, 1:(c/2)], rv[i, (c/2+1):c],
      equal.var=TRUE, alternative="two.sided")$p.value
  }
 
  hist(rv.t.pvalues)
 
  The histogram is this one:
  http://tinyurl.com/histogram-rt-pvalues-pdf
 
  all the best
  idaios
 
 
  On Wed, Jan 9, 2013 at 12:29 PM, Ted Harding ted.hard...@wlandres.net
 wrote:
 
  On 09-Jan-2013 08:50:46 Pavlos Pavlidis wrote:
   Dear all,
   I observe a strange behavior of the pvalues of the t-test under
   the null hypothesis. Specifically, I obtain 2 samples of 3
   individuals each from a normal distribution of mean 0 and variance 1.
   Then, I calculate the pvalue using the t-test (var.equal=TRUE,
   samples are independent). When I make a histogram of pvalues
   I see that consistently the bin of the smallest pvalues has a
   lower frequency. Is this a known behavior of the t-test or it's
   a kind of bug/random number generation problem?
  
   kind regards,
   idaios
 
  Using the following code, I did not observe the behaviour you describe.
  The histograms are consistent with a uniform distribution of the
  P-values, and the lowest bin for the P-values (when the code is
  run repeatedly) is not consistently lower (or higher, or anything
  else) than the other bins.
 
  ## My code:
  N <- 1
  Ps <- numeric(N)
  for(i in (1:N)){
    X1 <- rnorm(3,0,1) ; X2 <- rnorm(3,0,1)
    Ps[i] <- t.test(X1,X2,var.equal=TRUE)$p.value
  }
  hist(Ps)
  
 
  If you would post the code you used, the reason why you are observing
  this may become more evident!
 
  Hoping this helps,
  Ted.
 
  -
  E-Mail: (Ted Harding) ted.hard...@wlandres.net
  Date: 09-Jan-2013  Time: 10:29:21
  This message was sent by XFMail
  -
 
 

 -
 E-Mail: (Ted Harding) ted.hard...@wlandres.net
 Date: 09-Jan-2013  Time: 14:51:04
 This message was sent by XFMail
 -



Re: [R] Using objects within functions in formulas

2013-01-09 Thread Rui Barradas


Hello,

Try the following. It uses argument 'data' to pass the data.frame w2. In 
the function below, I've changed the pastes to two lines of code because 
the first one changes the way the formula is put together.


test1 <- function(x2, y2, w2) {
    #print(str(w2))
    p1 <- paste("(1|", names(w2), ")", collapse=" + ", sep="")
    p2 <- paste("y2 ~ x2 + ", p1)
    form = as.formula(p2)
    m1 = glmer(form, data = w2)
    return(m1)
}


Hope this helps,

Rui Barradas
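
As an alternative to pasting strings, the formula can be assembled with `reformulate()` from base R. This is only a sketch of the formula-building step: `make_form` is a hypothetical helper, the column names are taken from the example data in the question, and it assumes the response and predictors sit in the same data frame as the grouping variables so that everything can be passed through the `data` argument.

```r
# Build  response ~ fixed effects + (1|g1) + (1|g2) + ...  from character vectors
make_form <- function(response, fixed, groups) {
  random <- paste0("(1|", groups, ")")
  reformulate(c(fixed, random), response = response)
}

form <- make_form("y", c("x11", "x12", "x13"), c("w11", "w12", "w13"))
form
all.vars(form)   # every variable the model call will look up in `data`
```

The result could then be handed to the fitting function, for example `glmer(form, data = d)` with lme4 loaded and `d` (a hypothetical name) holding all seven columns.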
On 09-01-2013 16:53, Aidan MacNamara wrote:

Dear all,

I'm looking to create a formula within a function to pass to glmer()
and I'm having a problem that the following example will illustrate:

library(lme4)
y1 = rnorm(10)
x1 = data.frame(x11=rnorm(10), x12=rnorm(10), x13=rnorm(10))
x1 = data.matrix(x1)
w1 = data.frame(w11=sample(1:3,10, replace=TRUE), w12=sample(1:3,10,
replace=TRUE), w13=sample(1:3,10, replace=TRUE))

test1 <- function(x2, y2, w2) {

print(str(w2))
form = as.formula(paste("y2 ~ x2 + ",paste("(1|w2$", names(w2), ")",
collapse=" + ", sep="")))
m1 = glmer(form)
return(m1)
}

model1 = test1(x2=x1, y2=y1, w2=w1)

As can be seen from the print statement within the function, the
object w2 is present and is a data frame. However, the following
error occurs:

Error in is.factor(x) : object 'w2' not found

This can be rectified by making 'w2' global - defining it outside the
function. I know there are issues with defining formulas and
environment but I'm not sure why this problem is specific to 'w2' and
not the other objects passed to the function.

Any help would be appreciated.

Aidan MacNamara
EMBL-EBI


Re: [R] how to label two figures in the same chunk independently with knitr

2013-01-09 Thread Yihui Xie

Hi Francesco,

This is an advanced topic in knitr; it is called a chunk hook:
http://yihui.name/knitr/hooks

Sorry for the confusion on the name par; you can call it anything, e.g. mypar

knit_hooks$set(mypar = function(before, options, envir) {
  if (before) par(mar = c(4, 4, .1, .1))
})
opts_chunk$set(mypar = TRUE)

For par(bg=rgb(runif(1), runif(1), runif(1))), it is nothing but a
line of normal R code.
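
To make the mechanics concrete without running knitr, here is a plain-R sketch. `mypar_hook` is a stand-in with the same three arguments; in a real document knitr supplies them itself, calling the hook once with `before = TRUE` ahead of the chunk and once with `before = FALSE` after it. The `rgb(runif(1), ...)` line is unrelated to those arguments: it simply builds a random colour string.

```r
# Same signature as a knitr chunk hook; nothing here depends on knitr
mypar_hook <- function(before, options, envir) {
  if (before) "set graphical parameters now" else "chunk finished"
}

# Roughly what happens around a chunk that has the option mypar = TRUE
# (the options list below is a hand-made stand-in for knitr's chunk options)
opts <- list(mypar = TRUE)
msg_before <- mypar_hook(before = TRUE,  options = opts, envir = globalenv())
msg_after  <- mypar_hook(before = FALSE, options = opts, envir = globalenv())

# rgb() takes red, green and blue intensities in [0, 1] and returns "#RRGGBB"
set.seed(1)
bg_col <- rgb(runif(1), runif(1), runif(1))
bg_col
```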

Regards,
Yihui
--
Yihui Xie xieyi...@gmail.com
Phone: 515-294-2465 Web: http://yihui.name
Department of Statistics, Iowa State University
2215 Snedecor Hall, Ames, IA


On Wed, Jan 9, 2013 at 2:59 AM, Francesco Sarracino
f.sarrac...@gmail.com wrote:
 Dear Yihui,

 thanks a lot for your kind reply. Your solution is very elegant and
 versatile.
 However, there is a point that is obscure to me, and I didn't manage to fully
 understand it after looking at the knitr manual and graphics manual.
 The issue concerns the hook:

 knit_hooks$set(par = function(before, options, envir) {
   if (before) par(mar = c(4, 4, .1, .1))
 })

 why do you set par as a function?
 moreover, below you write:

 par(bg=rgb(runif(1), runif(1), runif(1)))

 does this mean that before = rgb(runif(1)); options = runif(1) and envir =
 runif(1) ?

 and what does this produce? I don't understand what's going on, can you
 please help me or address me to some documentation?

 thanks in advance for your kind help,
 f.



Re: [R] Using objects within functions in formulas

2013-01-09 Thread David Winsemius



On Jan 9, 2013, at 8:53 AM, Aidan MacNamara wrote:


Dear all,

I'm looking to create a formula within a function to pass to glmer()
and I'm having a problem that the following example will illustrate:

library(lme4)
y1 = rnorm(10)
x1 = data.frame(x11=rnorm(10), x12=rnorm(10), x13=rnorm(10))
x1 = data.matrix(x1)
w1 = data.frame(w11=sample(1:3,10, replace=TRUE), w12=sample(1:3,10,
replace=TRUE), w13=sample(1:3,10, replace=TRUE))

test1 <- function(x2, y2, w2) {

print(str(w2))
form = as.formula(paste("y2 ~ x2 + ",paste("(1|w2$", names(w2), ")",
collapse=" + ", sep="")))
m1 = glmer(form)
return(m1)
}

model1 = test1(x2=x1, y2=y1, w2=w1)

As can be seen from the print statement within the function, the
object w2 is present and is a data frame. However, the following
error occurs:

Error in is.factor(x) : object 'w2' not found


Generally regression functions in R will be expecting to get one  
'data' argument and build formulas using column names from that object.


 test1 <- function(x2, y2, w2) {
   w3 <- cbind(w2, x2, x2)
   print(str(w3))
   form = as.formula(paste("y2 ~ x2 + ",paste("(1|", names(w2), ")",
                           collapse=" + ", sep="")))
   m1 = glmer(form, data=w3); print(summary(m1))
   return(m1)
 }

model1 = test1(x2=x1, y2=y1, w2=w1)




This can be rectified by making 'w2' global - defining it outside the
function. I know there are issues with defining formulas and
environment but I'm not sure why this problem is specific to 'w2' and
not the other objects passed to the function.

Any help would be appreciated.

Aidan MacNamara
EMBL-EBI




David Winsemius, MD
Alameda, CA, USA


[R] writing to .xlsx

2013-01-09 Thread Benjamin Caldwell

Dear r helpers;

I'm interested in reading from and writing to large .xlsx files fairly
regularly.  (Why, the naysayers may ask - and the answer is basically
colleagues and clients who prefer that format). I've tried out the
XLConnect and xlsx libraries, but the java implementation they use just
takes too much RAM for the files I'm working with.

gdata leverages perl and works really well for reading in those files, so
half the problem is solved for me! I don't see anything in the
documentation about writing .xlsx, though. Is anyone aware of any libraries
or clever solutions in R that would get the job done for me? I see a couple
packages on CPAN for writing an xlsx, so it's been done in perl; perhaps it
would be easy to run that from R? I don't use perl myself (yet?).

Looking for recommendations.

Best

Ben Caldwell


[R] Need help setting up a mirror

2013-01-09 Thread Valerie Duncan

Hello.

I am trying to follow the instructions 
here (http://mirror.fcaglp.unlp.edu.ar/CRAN/) to set up a mirror at my company 
as my developers are not allowed to go outside our firewall to download R 
packages.  I am not having much success in getting this to work.

Here is how I configured my httpd server:
<VirtualHost *:80>
   ServerName cran.xxx.xxx.xxx.org
   RewriteEngine on
   RewriteRule ^package=(.+) /web/packages/$1/index.html [R=seeother]
   RewriteRule ^view=(.+) /web/views/$1.html [R=seeother]
   DocumentRoot /ddd/ddd/ddd/ftp.ussg.iu.edu/CRAN
</VirtualHost>

Here is the directory structure of my Linux server for CRAN:
/opt/OSS/CRAN-Mirror/ftp.ussg.iu.edu/CRAN:
   bin
   doc
   src
   web

I don't find a banner.shtml anywhere or indeed anything that looks like your 
front page.  I cannot get rsync to work so I used wget.  Here is the command I 
used:

  /usr/bin/wget -r -P '/ddd/ddd/ddd/' 
'http://ftp.ussg.iu.edu/CRAN/web/packages/available_packages_by_name.html'

 Can you see what I am doing wrong?

   Valerie
___
valerie duncan







Re: [R] writing to .xlsx

2013-01-09 Thread jim holtman

Can you use '.xls' format files? If so, XLConnect works pretty well
for those.  If you are using '.xlsx' format (zip files internally),
XLConnect takes much more CPU and memory to handle them.

On Wed, Jan 9, 2013 at 2:19 PM, Benjamin Caldwell
btcaldw...@berkeley.edu wrote:
 Dear r helpers;

 I'm interested in reading from and writing to large .xlsx files fairly
 regularly.  (Why, the naysayers may ask - and the answer is basically
 colleagues and clients who prefer that format). I've tried out the
 XLConnect and xlsx libraries, but the java implementation they use just
 takes too much RAM for the files I'm working with.

 gdata leverages perl and works really well for reading in those files, so
 half the problem is solved for me! I don't see anything in the
 documentation about writing .xlsx, though. Is anyone aware of any libraries
 or clever solutions in R that would get the job done for me? I see a couple
 packages on CPAN for writing an xlsx, so it's been done in perl; perhaps it
 would be easy to run that from R? I don't use perl myself (yet?).

 Looking for recommendations.

 Best

 Ben Caldwell




-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.


Re: [R] R encrypt/decrypt

2013-01-09 Thread Thomas Lumley

I suggest looking at mcrypt.  There is a PHP module, and you could either
call out from R to the mcrypt program or use libmcrypt and C calls. It
supports AES, and other standard things.

There's no real saving of effort in using weaker ciphers, and you really
don't want to be implementing the processing yourself.

-thomas


On Thu, Jan 10, 2013 at 6:59 AM, Ramiro Barrantes 
ram...@precisionbioassay.com wrote:

 Hello,

 I am working on a web system (php) that uses R in the backend, and we need
 some basic fast encryption/decryption for the underlying mysql database
 that can be used by both R AND php.  It does not need to be
 top-of-the-line, but just provide some basic level of fast
 encryption/decryption.

 Any suggestions?

 Thank you,

 Ramiro





-- 
Thomas Lumley
Professor of Biostatistics
University of Auckland


Re: [R] weighted factor analysis

2013-01-09 Thread Thomas Lumley

It depends on what sort of weights you have, but one approach is to
construct a weighted covariance matrix and then run factanal() on it.

That's what svyfactanal() in the survey package does.  The difficult part
is the tests: you need to specify the sample size, and in the presence of
weights it may not be clear what the right sample size is -- svyfactanal()
has four options, probably none of them is ideal.

-thomas
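
A minimal sketch of the first approach, using only base R: `cov.wt()` builds the weighted covariance matrix and `factanal()` accepts it through `covmat`. The simulated data, the weights `w`, and the single-factor model are illustrative assumptions. This is not the design-based analysis that `svyfactanal()` performs, and the `n.obs` stored by `cov.wt()` (the row count) is just one of the debatable sample-size choices mentioned above.

```r
set.seed(7)
n <- 300
f <- rnorm(n)                                   # one latent factor
load <- c(0.9, 0.8, 0.7, 0.6, 0.5, 0.4)         # made-up loadings
X <- sapply(load, function(l) l * f + rnorm(n, sd = sqrt(1 - l^2)))
colnames(X) <- paste0("v", 1:6)

w <- runif(n, 0.5, 2)                           # hypothetical observation weights
w <- w / sum(w)                                 # normalise to sum to 1

wc  <- cov.wt(X, wt = w)                        # list with cov, center, n.obs, wt
fit <- factanal(covmat = wc, factors = 1)       # n.obs is taken from the list
fit$loadings
```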


On Thu, Jan 10, 2013 at 5:36 AM, Virgile Capo-Chichi
vcapochi...@gmail.com wrote:

 hello there,
 I am trying to use a weight variable in a factor analysis but apparently
 the factanal command does not have a weight option. Any way to do this? Thanks
 for your suggestions, V





-- 
Thomas Lumley
Professor of Biostatistics
University of Auckland


[R] Sweave, Texshop, and sync with included Rnw file

2013-01-09 Thread michele caseposta

Hello everyone.
I am in the process of writing a book in Latex with Texshop, on Mac.
This book contains a lot of R code, hence the need to use Sweave.
I was able to compile Rnw files, and to sync back and forth from the pdf to the 
source Rnw.
My problem now is that the book is divided in Chapters, and every chapter is in 
its own Rnw file.
I can compile them from the main one (book.Rnw) using the directive

\SweaveInput{chapter1.Rnw}

The problem is that, done this way, I lose synchronization 
between the pdf and the source Rnw. If the text is in book.Rnw I can 
synchronize, but if the text is in one of the included files, it just doesn't 
work.
I am using the sweave engine found in the following webpage:

http://cameron.bracken.bz/synctex-with-sweavepgfsweave-in-texshoptexworks

Has anybody succeeded in synchronizing with included Rnw files?

Thanks,
Mic

[R] We've posted our 2013R Courses by XLSolutions Corp at 9 USA Cities: San Francisco, New York, Washington DC, Houston, Boston, Las Vegas, Seattle, etc

2013-01-09 Thread Sue Turner

Happy New Year !

XLSolutions January-February 2013 R courses schedule is now available
online for 9 USA cities, with 13 new courses: *** Suggest a future course
date/city

(1) R-PLUS: A Point-and-Click Approach to R
(2) S-PLUS / R : Programming Essentials.
(3) R/S+ Fundamentals and Programming Techniques
(4) R/S-PLUS Functions by Example.
(5) S/R-PLUS Programming 3: Advanced Techniques and Efficiencies.
(6) R/S+ System: Advanced Programming.
(7) R/S-PLUS Graphics: Essentials.
(8) R/S-PLUS Graphics for SAS Users
(9) R/S-PLUS Graphical Techniques for Marketing Research.
(10) Multivariate Statistical Methods in R/S-PLUS: Practical Research
Applications
(11) Introduction to Applied Econometrics with R/S-PLUS
(12) Exploratory Analysis for Large and Complex Problems in R/S-PLUS
(13) Determining Power and Sample Size Using R/S-PLUS.
(14) R/S-PLUS: Data Preparation for Data Mining
(15) Data Cleaning Techniques in R/S-PLUS
(16) R/S-PLUS: Applied Clustering Techniques


More on website

http://www.xlsolutions-corp.com/courselistlisting.aspx

Ask for group discount and reserve your seat Now - Earlybird Rates.
Payment due after the class! Email Sue Turner:  sue at
xlsolutions-corp.com

Phone: 206-686-1578


Please let us know if you and your colleagues are interested in this
class to take advantage of group discount. Register now to secure your
seat.

Cheers,
Elvis Miller, PhD
Manager Training.
XLSolutions Corporation
206 686 1578
www.xlsolutions-corp.com
elvis at xlsolutions-corp.com
http://www.xlsolutions-corp.com/courselistlisting.aspx


[R] Random Rectangles

2013-01-09 Thread David Arnold

Hi,

Just curious. Has anyone out there ever written a script to generate 100
random rectangles such as the ones shown on this page?

http://www2.math.umd.edu/~jlh/214/Random%20Rectangles.pdf

Thanks.

D.



--
View this message in context: 
http://r.789695.n4.nabble.com/Random-Rectangles-tp4655072.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] R encrypt/decrypt

2013-01-09 Thread Suzen, Mehmet

On 9 January 2013 18:59, Ramiro Barrantes ram...@precisionbioassay.com wrote:
 I am working on a web system (php) that uses R in the backend, and we need 
 some basic fast encryption/decryption for the underlying mysql database that 
 can be used by both R AND php.  It does not need to be top-of-the-line, but 
 just provide some basic level of fast encryption/decryption.

 Any suggestions?

Sounds too generic. This is not really an R-help question.
Not sure what you mean by "underlying mysql". Are you going to
store encrypted data in the db? If it is about transport between the sql and
web servers: these servers can be configured to use SSL!
What is your aim?

BTW: Maybe you should remove php and use R directly via Rook;
http://cran.r-project.org/web/packages/Rook/index.html


Re: [R] writing to .xlsx

2013-01-09 Thread Gabor Grothendieck

On Wed, Jan 9, 2013 at 2:19 PM, Benjamin Caldwell
btcaldw...@berkeley.edu wrote:
 Dear r helpers;

 I'm interested in reading from and writing to large .xlsx files fairly
 regularly.  (Why, the naysayers may ask - and the answer is basically
 colleagues and clients who prefer that format). I've tried out the
 XLConnect and xlsx libraries, but the java implementation they use just
 takes too much RAM for the files I'm working with.

 gdata leverages perl and works really well for reading in those files, so
 half the problem is solved for me! I don't see anything in the
 documentation about writing .xlsx, though. Is anyone aware of any libraries
 or clever solutions in R that would get the job done for me? I see a couple
 packages on CPAN for writing an xlsx, so it's been done in perl; perhaps it
 would be easy to run that from R? I don't use perl myself (yet?).

 Looking for recommendations.

 Best

 Ben Caldwell

Check out
http://rwiki.sciviews.org/doku.php?id=tips:data-io:ms_windows&s=excel
and in particular the WriteXLS package can write Excel 2003 files
(xls) using perl.


--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com


[R] graphical distance matrix

2013-01-09 Thread eliza botto


Dear R-family,
I made a distance matrix of about 2000 stations. It is extremely hard to 
visualize the details of that matrix. I heard that there is a way in R to 
represent the details of a distance matrix graphically. More precisely, different 
sections of the distance matrix can be presented in different colors, with low 
values presented in light colors and high values in dark ones. Is there really a 
way of doing this?
thanks in advance
regards
elisa 

Re: [R] R encrypt/decrypt

2013-01-09 Thread Ramiro Barrantes

Dear Suzen,

Thank you for your reply.  What I meant was that some fields in the database 
will be encrypted (the data for those fields will be entered via php web 
interface and then encrypted and stored on the mysql db), and then I will use R 
to read such database and do appropriate post-processing, which will then need 
to be encrypted and stored into the mysql db (with R hopefully).  In other 
words, I have a shared mysql database with some encrypted fields, and I need R 
and php to both understand the encryption/decryption.   

I thought this would be an appropriate question for the group as perhaps 
someone might know of an R encrypt/decrypt mechanism that also has a 
counterpart on php or has suggestions about the situation.  Sorry for the 
confusion in my question.

Thank you,
Ramiro





From: mehmet.su...@gmail.com [mehmet.su...@gmail.com] on behalf of Suzen, 
Mehmet [msu...@gmail.com]
Sent: Wednesday, January 09, 2013 3:38 PM
To: Ramiro Barrantes
Cc: r-help@r-project.org
Subject: Re: [R] R encrypt/decrypt

On 9 January 2013 18:59, Ramiro Barrantes ram...@precisionbioassay.com wrote:
 I am working on a web system (php) that uses R in the backend, and we need 
 some basic fast encryption/decryption for the underlying mysql database that 
 can be used by both R AND php.  It does not need to be top-of-the-line, but 
 just provide some basic level of fast encryption/decryption.

 Any suggestions?

Sounds too generic. This is not really an R-help question.
Not sure what do you mean by underlying mysql. Are you going to
encrypt data into db? If it
is about transport between sql and web servers: these servers can be
configured to use SSL!
What is your aim?

BTW: Maybe you should remove php and use R directly via Rook;
http://cran.r-project.org/web/packages/Rook/index.html


Re: [R] Sweave, Texshop, and sync with included Rnw file

2013-01-09 Thread Yihui Xie

I believe RStudio has done a fairly good job in terms of the
synchronization. If you have to stick to TeXShop, I do not have any
ideas on how to make it work with Sweave child documents.

Regards,
Yihui
--
Yihui Xie xieyi...@gmail.com
Phone: 515-294-2465 Web: http://yihui.name
Department of Statistics, Iowa State University
2215 Snedecor Hall, Ames, IA


On Wed, Jan 9, 2013 at 2:25 PM, michele caseposta mic.c...@gmail.com wrote:
 Hello everyone.
 I am in the process of writing a book in Latex with Texshop, on Mac.
 This book contains a lot of R code, hence the need to use Sweave.
 I was able to compile Rnw files, and to sync back and forth from the pdf to 
 the source Rnw.
 My problem now is that the book is divided in Chapters, and every chapter is 
 in its own Rnw file.
 I can compile them from the main one (book.Rnw) using the directive

 \SweaveInput{chapter1.Rnw}

 The problem stands in the fact that like this I am missing synchronization 
 between the pdf and the source Rnw. If part of text is in book.Rnw I can 
 synchronize, but if the text is in one of the included files, it just doesn't 
 work.
 I am using the sweave engine found in the following webpage:

 http://cameron.bracken.bz/synctex-with-sweavepgfsweave-in-texshoptexworks

 Has anybody succeeded in synchronizing with included Rnw files?

 Thanks,
 Mic


Re: [R] incrementation within ifelse

2013-01-09 Thread Adams, Jean

Damien,

You don't give an example of what your data frame looks like or what you
want the new column to look like (given that example data), but I created
an example data frame for z, and wrote a few lines of code to add a new
column.  Check it out and see if it comes close to doing what you want.

first <- function(x) c(1, 1-(x[-1]==x[-length(x)]))
n <- 25
z <- data.frame(flagFoehn3_durr=sample(0:2, n, TRUE), Guetsch=sample(0:2,
n, TRUE))
z$newColumn <- cumsum(first(z$flagFoehn3_durr==1) & z$flagFoehn3_durr==1)
z$newColumn[z$flagFoehn3_durr!=1] <- 0
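
The same idea can be checked on a small fixed vector instead of random data. This is only a sketch: the helper name run_id and the example flag values are made up for illustration, and it assumes the goal is to number each run of consecutive 1s and to give 0 to every other row.

```r
# Number each run of consecutive 1s in 'flag'; rows where flag != 1 get 0.
# 'run_id' is a hypothetical helper name, not part of any package.
run_id <- function(flag) {
  is_one <- flag == 1
  # A run starts where flag is 1 and the previous row was not 1.
  starts <- is_one & !c(FALSE, head(is_one, -1))
  ifelse(is_one, cumsum(starts), 0)
}

flag <- c(0, 1, 1, 0, 2, 1, 0, 1, 1, 1)
run_id(flag)
# 0 1 1 0 0 2 0 3 3 3
```

Because everything is vectorized, this should scale to a million rows without an explicit loop.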

Jean




On Tue, Jan 8, 2013 at 12:33 PM, Damien Pilloud damien.pill...@gmail.com wrote:

 Dear R-helper,

 I am working on a very large data frame and I am trying to add a new column
 and fill it according to certain conditions. I have tried this code with
 the data frame p:

 ID = 0

 p[,"newColumn"] <-
 ifelse (p$flagFoehn3_durr == 1,
 ifelse(p$Guetsch == 0,
 ID <- ID ++
 ,
 ID
 )
 ,
 0
 )

 What I am trying to do is to increment the ID when p$Guetsch == 0 and to
 put this result in the column. The problem is that ID does not increment
 itself.

 Another way is to use a for loop, as in this example:

 ID = 0
 for (s in 1:(nrow(z))){

 z[s,"newColumn"] <-
 if (z$flagFoehn3_durr[s] == 1){
 if(z$flagFoehn3_durr[s-1] == 0){
 ID <- ID+1
 }else{
 ID
 }
 }else{
 0
 }
 }

 This works perfectly, but the problem is that it would take more than a
 month to run.

 Is there a way to increment with the first code I used, or a way of running
 the second code faster? (I have more than 1 million rows.)

 Thanks!

 Cheers,

 Damien


[R] How to estate the correlation between two autocorrelated variables

2013-01-09 Thread Zhiqiu Hu

Dear R users,

In my data, there are two variables t1 and t2. For each observation of t1
and t2, two location indicators (x, y) were provided.

The data format is
#x   y   t1   t2

Since both t1 and t2 depend on x and y, t1 and t2 are
autocorrelated variables. My question is how to calculate the correlation
between t1 and t2 while taking into account the structure of the residual
variance caused by x and y. It seems that the gls function in the nlme
package might be usable for this purpose. However, I failed to figure out how
to use the function for my data. I would appreciate your kind help in providing
example code for the above data format. Please also let me know if there is
any other more suitable R package for the analysis.

Best regards,

Zhiqiu


Re: [R] Sweave, Texshop, and sync with included Rnw file

2013-01-09 Thread michele caseposta

Hi Yihui,
yes, RStudio works flawlessly with synchronization, but working with it I will 
lose all the features of a full-fledged tex editor, first of all BibDesk 
integration.
Texworks can also do two-way sync in Rnw, and I tried to switch configurations 
but with no luck.
In texworks I am using RScript with the options recommended in this document:

http://www.math.montana.edu/~jimrc/classes/Rseminar/TexWorks.pdf

The problem with texworks is the same: no integration with bibdesk
 


On Jan 9, 2013, at 4:06 PM, Yihui Xie wrote:

 I believe RStudio has done a fairly good job in terms of the
 synchronization. If you have to stick to TeXShop, I do not have any
 ideas on how to make it work with Sweave child documents.
 
 Regards,
 Yihui
 --
 Yihui Xie xieyi...@gmail.com
 Phone: 515-294-2465 Web: http://yihui.name
 Department of Statistics, Iowa State University
 2215 Snedecor Hall, Ames, IA
 
 
 On Wed, Jan 9, 2013 at 2:25 PM, michele caseposta mic.c...@gmail.com wrote:
 Hello everyone.
 I am in the process of writing a book in Latex with Texshop, on Mac.
 This book contains a lot of R code, hence the need to use Sweave.
 I was able to compile Rnw files, and to sync back and forth from the pdf to 
 the source Rnw.
 My problem now is that the book is divided in Chapters, and every chapter is 
 in its own Rnw file.
 I can compile them from the main one (book.Rnw) using the directive
 
 \SweaveInput{chapter1.Rnw}
 
 The problem stands in the fact that like this I am missing synchronization 
 between the pdf and the source Rnw. If part of text is in book.Rnw I can 
 synchronize, but if the text is in one of the included files, it just 
 doesn't work.
 I am using the sweave engine found in the following webpage:
 
 http://cameron.bracken.bz/synctex-with-sweavepgfsweave-in-texshoptexworks
 
 Has anybody succeeded in synchronizing with included Rnw files?
 
 Thanks,
 Mic


Re: [R] R encrypt/decrypt

2013-01-09 Thread Suzen, Mehmet

Hello Ramiro,

I am still not sure why you need to encrypt/decrypt data in R.
One can encrypt/decrypt data on the SQL server side.

https://dev.mysql.com/doc/refman/5.5/en/encryption-functions.html

If your concern is the web traffic, then again, SQL servers support SSL:

http://dev.mysql.com/doc/refman/5.1/en/ssl-connections.html

I think RMySQL can connect via SSL. You may also consider RCurl to talk to
your php code.

Best,
-m


On 9 January 2013 21:54, Ramiro Barrantes ram...@precisionbioassay.com wrote:
 Dear Suzen,

 Thank you for your reply.  What I meant was that some fields in the database 
 will be encrypted (the data for those fields will be entered via php web 
 interface and then encrypted and stored on the mysql db), and then I will use 
 R to read such database and do appropriate post-processing, which will then 
 need to be encrypted and stored into the mysql db (with R hopefully).  In 
 other words, I have a shared mysql database with some encrypted fields, and I 
 need R and php to both understand the encryption/decryption.

 I thought this would be an appropriate question for the group as perhaps 
 someone might know of an R encrypt/decrypt mechanism that also has a 
 counterpart on php or has suggestions about the situation.  Sorry for the 
 confusion in my question.

 Thank you,
 Ramiro




 
 From: mehmet.su...@gmail.com [mehmet.su...@gmail.com] on behalf of Suzen, 
 Mehmet [msu...@gmail.com]
 Sent: Wednesday, January 09, 2013 3:38 PM
 To: Ramiro Barrantes
 Cc: r-help@r-project.org
 Subject: Re: [R] R encrypt/decrypt

 On 9 January 2013 18:59, Ramiro Barrantes ram...@precisionbioassay.com 
 wrote:
 I am working on a web system (php) that uses R in the backend, and we need 
 some basic fast encryption/decryption for the underlying mysql database that 
 can be used by both R AND php.  It does not need to be top-of-the-line, but 
 just provide some basic level of fast encryption/decryption.

 Any suggestions?

 Sounds too generic. This is not really an R-help question.
 Not sure what do you mean by underlying mysql. Are you going to
 encrypt data into db? If it
 is about transport between sql and web servers: these servers can be
 configured to use SSL!
 What is your aim?

 BTW: Maybe you should remove php and use R directly via Rook;
 http://cran.r-project.org/web/packages/Rook/index.html


Re: [R] Basic loop programming

2013-01-09 Thread MacQueen, Don

Yes, R is a different language, and has different syntax and different
built-in functions, so, yes it works differently.

If you want to do it the same way in R as in that other language, you have
to use a different method for constructing the variable names inside the
loop. Here's an example, using the get() and assign() functions to
construct the variable names, essentially replacing your constructions
like  Customers_2012_'i'.

I have four variables, named
  s01, s02
  c01, c02
(sales and customers for two months)

Something like this should do it:

for (i in c('01','02')) {
  assign( paste0('r',i) ,
  get(paste0('s',i))/get(paste0('c',i))
  )
}

I should now have new variables r01 and r02.

This is not tested, so hopefully I got all the parentheses matched.

Of course, that looks cumbersome and ugly, and it is. There are other ways
in R to store your data, for which the code will be much friendlier.

If you use
   i in 1:12
you are creating numbers, but your variable names use character strings,
'01','02', etc. So, no, you can't use
  i in 1:12
directly. But you can use i in 1:12 if you use a formatting function on i
to convert it to a character string with leading zeros. The formatC
function is one such function; there are others.
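
To make the last two points concrete, here is a minimal sketch. The monthly totals are invented, only two months are shown, and the data frame layout at the end is just one possible friendlier way to store the data, not the only one.

```r
# Hypothetical monthly totals, named as in the original question.
Sales_2012_01 <- 100; Sales_2012_02 <- 240
Customers_2012_01 <- 10; Customers_2012_02 <- 12

for (i in 1:2) {
  mm <- formatC(i, width = 2, flag = "0")  # 1 becomes "01", 2 becomes "02"
  assign(paste0("Av_revenue_2012_", mm),
         get(paste0("Sales_2012_", mm)) / get(paste0("Customers_2012_", mm)))
}

# Friendlier alternative: one row per month in a data frame,
# so the whole calculation is a single vectorized division.
monthly <- data.frame(month = sprintf("%02d", 1:2),
                      sales = c(100, 240),
                      customers = c(10, 12))
monthly$av_revenue <- monthly$sales / monthly$customers
```

With the data frame version, extending to twelve months only means adding rows; no variable names have to be constructed.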

-Don

-- 
Don MacQueen

Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062





On 1/9/13 8:02 AM, Paolo Donatelli donatellipa...@gmail.com wrote:

Hi all,

newbie question: I am trying to set up a very simple loop without
succeeding.

Let's say I have monthly observation of two variables for a year
- Sales_2012_01, Sales_2012_02, Sales_2012_03,   (total sales
for jan 2012,feb 2012, etc.)
- Customers_2012_01, Customers_2012_02,   (total number of
customers for jan 2012, etc.)

and I want to create new monthly variables in order to compute
revenues per customers:

Av_revenue_2012_01 = Sales_2012_01 / Customers_2012_01
Av_revenue_2012_02 = Sales_2012_02 / Customers_2012_02
...

how can I proceed?


In other programming language I used just to write something like
for (i in list(01,02, ..., 12) {
Av_revenue_2012_'i' = Sales_2012_'i' / Customers_2012_'i'
}

but in R it seems not to work like that. Further, and correct me if I
am wrong, I cannot use simple (i in 1:12) since I have a 0 digit in
front of the single-digit months.

thanks in advance for your help


Re: [R] GLMM post- hoc comparisons

2013-01-09 Thread Helios de Rosario

On 08/01/2013 at 12:40, Silvina Velez sve...@mendoza-conicet.gob.ar wrote:
 Hi All,
 I have data about seed predation (SP) in fruits of three different colors
 (yellow, mottled, dark) and in two fruiting seasons (2007, 2008). I performed
 a GLMM (lmer function, lme4 package) and the outcome showed that the
 interaction term (color:season) was significant, and some combinations of
 this interaction have significant Pr(>|z|), but I don't think they are the
 right significant combinations, because when I look at the bwplot, these
 combinations seem to be very different from the other ones. So, I would like
 to know if there is any a posteriori test to obtain the p-values for each
 combination of color:season, and thereby be able to know which combination(s)
 is/are really significant.
 
 m1=lmer(SP ~ color + season:color +(1|Site:tree), data=datosfl,
 family=poisson)
   AIC   BIC logLik deviance
 178.3 196.6 -81.14    162.3
 Random effects:
  Groups    Name        Variance Std.Dev.
  obsBR     (Intercept) 0.064324 0.25362
  Site:tree (Intercept) 0.266490 0.51623
 Number of obs: 73, groups: obsBR, 73; Site:tree, 37
 
                   Estimate Std. Error z value Pr(>|z|)
 (Intercept)         2.5089     0.2750   9.125   <2e-16 ***
 colorM             -0.1140     0.3242  -0.352   0.7250
 colorD             -0.6450     0.4178  -1.544   0.1227
 Season2008         -0.7343     0.3104  -2.365   0.0180 *
 colorM:Season2008   0.2505     0.4352   0.576   0.5648
 colorD:Season2008   1.1445     0.5747   1.992   0.0464 *

Hi Silvina,

What exactly do you mean by "what combination(s) is/are significant"?
If you mean which combinations have significantly greater SP than the
baseline combination (yellow:2007), the table that you have copied may
be what you actually want. If you want to test other contrasts between
color:season combinations, perhaps you can use the function
testInteractions() from package phia. For instance:

testInteractions(m1)

will give you a test of all the pairwise contrasts between color and
season. You can also test simple main effects, or other specific
contrasts by adding further arguments (see the documentation and the
package vignette). Anyway, the calculation of p-values in mixed models
must always be taken with care.

Helios De Rosario-Martinez
Instituto de Biomecánica de Valencia



INSTITUTO DE BIOMECÁNICA DE VALENCIA
Universidad Politécnica de Valencia • Edificio 9C
Camino de Vera s/n • 46022 VALENCIA (ESPAÑA)
Tel. +34 96 387 91 60 • Fax +34 96 387 91 69
www.ibv.org


Re: [R] Sweave, Texshop, and sync with included Rnw file

2013-01-09 Thread Duncan Mackay


Hi

Perhaps you need to make a master file and call the chapter files from it

eg (just copying the relevant section from my master GClimate12.Rnw file)


latex preliminaries + R options + begin

% plots
\SweaveInput{GClimate12RX.Rnw}
% Soil 300 % 15
\SweaveInput{GClimate12SP.Rnw}
% proportions  % 16
\SweaveInput{GClimate12RS.Rnw}
% cumsum days % 17
\SweaveInput{GClimate12RC.Rnw}
closing commands etc
end document

This means that you require 1 setup page rather than individual ones 
and the benefits attached for TOC etc


Regards

Duncan

Duncan Mackay
Department of Agronomy and Soil Science
University of New England
Armidale NSW 2351
Email: home: mac...@northnet.com.au

At 06:25 10/01/2013, you wrote:

Hello everyone.
I am in the process of writing a book in Latex with Texshop, on Mac.
This book contains a lot of R code, hence the need to use Sweave.
I was able to compile Rnw files, and to sync back and forth from the 
pdf to the source Rnw.
My problem now is that the book is divided in Chapters, and every 
chapter is in its own Rnw file.

I can compile them from the main one (book.Rnw) using the directive

\SweaveInput{chapter1.Rnw}

The problem stands in the fact that like this I am missing 
synchronization between the pdf and the source Rnw. If part of text 
is in book.Rnw I can synchronize, but if the text is in one of the 
included files, it just doesn't work.

I am using the sweave engine found in the following webpage:

http://cameron.bracken.bz/synctex-with-sweavepgfsweave-in-texshoptexworks

Has anybody succeeded in synchronizing with included Rnw files?

Thanks,
Mic

Re: [R] Using objects within functions in formulas

2013-01-09 Thread Ben Bolker

David Winsemius dwinsemius at comcast.net writes:

 On Jan 9, 2013, at 8:53 AM, Aidan MacNamara wrote:

  I'm looking to create a formula within a function to pass to glmer()
  and I'm having a problem that the following example will illustrate:
 
  library(lme4)
  y1 = rnorm(10)
  x1 = data.frame(x11=rnorm(10), x12=rnorm(10), x13=rnorm(10))
  x1 = data.matrix(x1)
  w1 = data.frame(w11=sample(1:3,10, replace=TRUE), w12=sample(1:3,10,
  replace=TRUE), w13=sample(1:3,10, replace=TRUE))
 
  test1 <- function(x2, y2, w2) {
 
  print(str(w2))
  form = as.formula(paste("y2 ~ x2 + ", paste("(1|w2$", names(w2), ")",
  collapse=" + ", sep="")))
  m1 = glmer(form)
  return(m1)
  }
 
  model1 = test1(x2=x1, y2=y1, w2=w1)
 
  As can be seen from the print statement within the function, the
  object w2 is present and is a data frame. However, the following
  error occurs:
 
  Error in is.factor(x) : object 'w2' not found
 

[snip David's solution to try to make gmane happy about the amount
of quoted material]

  This can be rectified by making 'w2' global - defining it outside the
  function. I know there are issues with defining formulas and
  environment but I'm not sure why this problem is specific to 'w2' and
  not the other objects passed to the function.
 
  Any help would be appreciated.
 
  Aidan MacNamara
  EMBL-EBI

  I haven't had a chance to look at this, but I will try to get to it.
It would help if you could post it on the Issues page of the lme4
github site, https://github.com/lme4/lme4/ .  The bottom line is that
dealing appropriately with all the different possible ways to assign
and evaluate variables within formulas is trickier than I would like
it to be.  To the best of my knowledge I have solved most of these
problems in the development version of lme4, but another test case
will be useful.  As long as there is a reasonable workaround I'm unlikely
to put the effort into fixing the stable version of lme4 (sorry ...) 
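
One possible workaround in the meantime, offered only as a sketch (it is not David's snipped solution, and it has not been run against glmer here): refer to the grouping variables by column name instead of through w2$, keep the response, predictors and grouping variables in a single data frame, and hand that data frame to the fitting function through its data argument. The helper name build_formula and the column names below are hypothetical.

```r
# Build a mixed-model style formula from column names (base R only).
# 'build_formula' is a hypothetical helper, not part of lme4.
build_formula <- function(response, fixed, grouping) {
  random <- paste0("(1 | ", grouping, ")")
  rhs <- paste(c(fixed, random), collapse = " + ")
  # Attach the caller's environment rather than this function's frame.
  as.formula(paste(response, "~", rhs), env = parent.frame())
}

# Everything the formula refers to lives in one data frame.
dat <- data.frame(y   = rnorm(10),
                  x11 = rnorm(10),
                  w11 = sample(1:3, 10, replace = TRUE),
                  w12 = sample(1:3, 10, replace = TRUE))

form <- build_formula("y", "x11", c("w11", "w12"))
# 'form' and 'dat' would then be passed to the model-fitting function,
# with 'dat' supplied as its data argument (assumed usage, not run here).
```

Since the variables are looked up in the data frame rather than in the function's frame, the formula no longer depends on where w2 happens to be visible.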

  Follow-ups to r-sig-mixed-mod...@r-project.org or (preferably)
to the aforementioned Issues list.

  Ben Bolker


Re: [R] writing to .xlsx

2013-01-09 Thread Marc Schwartz


On Jan 9, 2013, at 2:45 PM, Gabor Grothendieck ggrothendi...@gmail.com wrote:

 On Wed, Jan 9, 2013 at 2:19 PM, Benjamin Caldwell
 btcaldw...@berkeley.edu wrote:
 Dear r helpers;
 
 I'm interested in reading from and writing to large .xlsx files fairly
 regularly.  (Why, the naysayers may ask - and the answer is basically
 colleagues and clients who prefer that format). I've tried out the
 XLConnect and xlsx libraries, but the java implementation they use just
 takes too much RAM for the files I'm working with.
 
 gdata leverages perl and works really well for reading in those files, so
 half the problem is solved for me! I don't see anything in the
 documentation about writing .xlsx, though. Is anyone aware of any libraries
 or clever solutions in R that would get the job done for me? I see a couple
 packages on CPAN for writing an xlsx, so it's been done in perl; perhaps it
 would be easy to run that from R? I don't use perl myself (yet?).
 
 Looking for recommendations.
 
 Best
 
 Ben Caldwell
 
 Check out
 http://rwiki.sciviews.org/doku.php?id=tips:data-io:ms_windowss=excel
 and in particular the WriteXLS package can write Excel 2003 files
 (xls) using perl.
 

Thanks for the referral Gabor.

If Benjamin needs the xlsx format due to the larger dimensions supported, 
WriteXLS, since it writes xls format files, would not likely be suitable. 
Otherwise, of course, current versions of Excel can open the older format.

If Benjamin simply needs to dump larger (for some definition of larger) 
datasets externally in a format that is compatible with Excel, he could write out 
CSV files that, of course, can then be opened in Excel. That presumes that he 
is not looking to do any other formatting of the worksheets or other similar 
functionality that is native to Excel.
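
A minimal sketch of the CSV route, using only base R. The data frame and the temporary file are placeholders for illustration; a real export would use the actual dataset and a path the colleagues can reach.

```r
# Placeholder data standing in for a larger dataset.
big <- data.frame(id = 1:5, value = c(1.5, 2.5, NA, 4, 5))

# Write a CSV that Excel can open; missing values become empty cells.
out <- tempfile(fileext = ".csv")
write.csv(big, out, row.names = FALSE, na = "")

# Read it back to confirm the round trip.
back <- read.csv(out)
```

Setting row.names = FALSE avoids an extra unnamed first column when the file is opened in Excel.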

Regards,

Marc Schwartz


Re: [R] How to estate the correlation between two autocorrelated variables

2013-01-09 Thread Nordlund, Dan (DSHS/RDA)

 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of Zhiqiu Hu
 Sent: Wednesday, January 09, 2013 1:45 PM
 To: r-help@r-project.org
 Subject: [R] How to estate the correlation between two autocorrelated
 variables

 Dear R users,

 In my data, there are two variables t1 and t2. For each observation of
 t1
 and t2, two location indicators (x, y) were provided.

 The data format is
 #x   y   t1   t2

 Since the both t1 and t2 are depended on x and y, t1 and t2 are
 autocorrelated variables. My question is how to calculate the
 correlation
 between t1 and t2 by taking into account the structure of residual
 variance
 caused by x and y. Seemly, the gls function in nlme/R package might can
 be
 used for the purpose. However, I failed to figure out how to use the
 function for my data. I appreciate your kind help providing an example
 code
 for the above data format. Please also let me know if there is any
 other
 more suitable R package for the analysis.

 Best regards,

 Zhiqiu

If you want the partial correlation between t1 and t2 given x and y, then look 
at the pcor() function in the ppcor package.
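
For illustration, the partial correlation can also be sketched in base R by correlating the residuals of t1 and t2 after each has been regressed on x and y. This is a swapped-in technique rather than the ppcor call itself, the data below are simulated, and it only removes linear dependence on x and y; it does not model any spatial correlation structure.

```r
# Simulated data in the x, y, t1, t2 layout from the question.
set.seed(1)
n <- 200
d <- data.frame(x = runif(n), y = runif(n))
d$t1 <- 2 * d$x - d$y + rnorm(n)
d$t2 <- 2 * d$x - d$y + rnorm(n)

# Raw correlation, inflated by the shared dependence on x and y.
raw <- cor(d$t1, d$t2)

# Partial correlation: correlate what is left of t1 and t2
# after removing the linear effect of x and y from each.
partial <- cor(resid(lm(t1 ~ x + y, data = d)),
               resid(lm(t2 ~ x + y, data = d)))
```

Comparing raw and partial shows how much of the apparent association between t1 and t2 is carried by their common dependence on location.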

Hope this is helpful,

Dan

Daniel J. Nordlund
Washington State Department of Social and Health Services
Planning, Performance, and Accountability
Research and Data Analysis Division
Olympia, WA 98504-5204


Re: [R] writing to .xlsx

2013-01-09 Thread Benjamin Caldwell

Folks,

Thanks for your input. I'm pretty comfortable with the options for writing
to .xls; I'm interested in

1. Something that can write to .xlsx for the larger supported dimensions,
as Marc guessed; but of course he's right that .csv would work very well if
that was the main goal.

I'm really looking for

2. something I can use to write a new sheet (and/or columns within existing
sheets) of .xlsx workbooks so I can more easily work with colleagues who are
using those workbooks. It's a bit unwieldy, but they like to use VB, whose
shortcomings I don't have to go into here, and I'd like to get in and
out of their workbooks without all the current cumbersome open -> export as
.csv -> read in -> export as .csv -> copy into workbook.

The other option would be to convert them to using R, but so far no luck
there!

Thanks again

*Ben Caldwell*

PhD Candidate
University of California, Berkeley
130 Mulford Hall #3114
Berkeley, CA 94720
Office 223 Mulford Hall
(510)859-3358


On Wed, Jan 9, 2013 at 2:03 PM, Marc Schwartz marc_schwa...@me.com wrote:


 On Jan 9, 2013, at 2:45 PM, Gabor Grothendieck ggrothendi...@gmail.com
 wrote:

  On Wed, Jan 9, 2013 at 2:19 PM, Benjamin Caldwell
  btcaldw...@berkeley.edu wrote:
  Dear r helpers;
 
  I'm interested in reading from and writing to large .xlsx files fairly
  regularly.  (Why, the naysayers may ask - and the answer is basically
  colleagues and clients who prefer that format). I've tried out the
  XLConnect and xlsx libraries, but the java implementation they use just
  takes too much RAM for the files I'm working with.
 
  gdata leverages perl and works really well for reading in those files,
 so
  half the problem is solved for me! I don't see anything in the
  documentation about writing .xlsx, though. Is anyone aware of any
 libraries
  or clever solutions in R that would get the job done for me? I see a
 couple
  packages on CPAN for writing an xlsx, so it's been done in perl;
 perhaps it
  would be easy to run that from R? I don't use perl myself (yet?).
 
  Looking for recommendations.
 
  Best
 
  Ben Caldwell
 
  Check out
  http://rwiki.sciviews.org/doku.php?id=tips:data-io:ms_windowss=excel
  and in particular the WriteXLS package can write Excel 2003 files
  (xls) using perl.
 

 Thanks for the referral Gabor.

 If Benjamin needs the xlsx format due to the larger dimensions supported,
 WriteXLS, since it writes xls format files, would not likely be suitable.
 Otherwise, of course, current versions of Excel can open the older format.

 If Benjamin simply needs to dump larger (for some definition of larger)
 datasets externally in format that is compatible with Excel, he could write
 out CSV files that, of course, can then be opened in Excel. That presumes
 that he is not looking to do any other formatting of the worksheets or
 other similar functionality that is native to Excel.

 Regards,

 Marc Schwartz




Re: [R] how to count A, C, T, G in each row in a big data.frame?

2013-01-09 Thread Yao He

In fact I want to calculate the gene frequency of each SNP.

The key problems are:
1. My data.frame is large, about 50,000 rows, so it is very slow to
split() it by row.

2. The alleles in each SNP (each row) are different. Some are A/G, some
are G/C. It is a bit awkward for me to handle.

Thank you for your help

2013/1/9 jim holtman jholt...@gmail.com:
 forgot the data.  this will count the characters; you can add logic
 with 'table' to count groups

 
 x <-
 structure(list(name = c("Gga_rs10722041", "Gga_rs10722249", "Gga_rs10722565",
 "Gga_rs10723082", "Gga_rs10723993", "Gga_rs10724555", "Gga_rs10726238",
 "Gga_rs10726461", "Gga_rs10726774", "Gga_rs10726967", "Gga_rs10727581",
 "Gga_rs10728004", "Gga_rs10728156", "Gga_rs10728177", "Gga_rs10728373",
 "Gga_rs10728585", "Gga_rs10729598", "Gga_rs10729643", "Gga_rs10729685",
 "Gga_rs10729827"), chr = c(7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L,
 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L), pos = c(11248993L,
 20038370L, 16164457L, 38050527L, 20307106L, 13707090L, 12230458L,
 36732967L, 2790856L, 1305785L, 29631963L, 13606593L, 13656397L,
 2261611L, 32096703L, 13733153L, 16524147L, 558735L, 12514023L,
 3619538L), strand = c("+", "+", "+", "+", "+", "+", "+", "+",
 "+", "+", "+", "+", "+", "+", "+", "+", "+", "+", "+", "+"),
 X2353 = c("AA", "TT", "TT", "CC", "TT", "CC", "CC", "TT",
 "CC", "GG", "AG", "AG", "AG", "TT", "CC", "AG", "CC", "AA",
 "GG", "GG"), X2409 = c("AA", "CT", "TT", "CC", "CT", "CC",
 "CC", "TT", "CC", "GG", "GG", "AG", "AG", "TT", "CC", "AG",
 "CC", "AA", "AG", "GA"), X2500 = c("GA", "TT", "TT", "CC",
 "TT", "CC", "CC", "TT", "CC", "GG", "GG", "GG", "GG", "GT",
 "CT", "GG", "CC", "AA", "AA", "AA"), X2598 = c("AA", "TT",
 "TT", "CC", "TT", "CC", "CC", "TT", "CC", "GG", "AA", "AG",
 "GG", "TT", "CC", "AG", "TC", "AA", "AA", "AG"), X2610 = c("AA",
 "TT", "TT", "CC", "TT", "CC", "CC", "TT", "CC", "GG", "GA",
 "GA", "GG", "TT", "CC", "GA", "CC", "AA", "AA", "GA"), X2300 = c("GA",
 "TT", "TT", "CC", "TT", "CC", "CC", "TT", "CC", "GG", "GA",
 "AA", "AG", "TT", "TC", "AA", "TC", "AA", "AG", "AA"), X2507 = c("AG",
 "TT", "TT", "CC", "TT", "CC", "CC", "TT", "CC", "GG", "GG",
 "GA", "GG", "TT", "TC", "GG", "CC", "AA", "GA", "AG"), X2530 = c("AG",
 "TC", "TT", "CC", "TC", "CC", "CC", "TT", "CC", "GG", "AA",
 "GG", "GG", "TT", "CC", "GG", "CC", "AA", "AA", "AA"), X2327 = c("AA",
 "TT", "TT", "CC", "TT", "CC", "CC", "TT", "CC", "GG", "GA",
 "GG", "GG", "TT", "TC", "GG", "CC", "AA", "AA", "AA"), X2389 = c("AA",
 "CC", "TT", "CC", "CC", "CC", "CC", "TT", "CC", "AG", "GG",
 "AG", "GG", "TT", "TC", "AG", "CC", "AA", "AA", "AA"), X2408 = c("AA",
 "TT", "TT", "CC", "TT", "CC", "CC", "TT", "CC", "GG", "GA",
 "GA", "GG", "TT", "CC", "GA", "CC", "AA", "AA", "AG"), X2463 = c("AA",
 "TT", "TT", "CC", "TT", "CC", "CC", "TT", "CC", "GG", "GG",
 "GG", "GG", "TT", "CT", "GG", "CC", "AA", "AA", "GA"), X2420 = c("GA",
 "TC", "TT", "CC", "TC", "CC", "CC", "TT", "CC", "GG", "AG",
 "GG", "GG", "TG", "TT", "GG", "CT", "AA", "AA", "AA"), X2563 = c("GA",
 "CC", "TT", "CC", "TC", "CC", "CC", "TT", "CC", "GG", "GA",
 "GG", "GG", "GT", "TT", "GG", "CT", "AA", "AA", "AA"), X2462 = c("AA",
 "TT", "TT", "CC", "TT", "CC", "CC", "TT", "CC", "GG", "AA",
 "GG", "GG", "GT", "TC", "GG", "CC", "AA", "AA", "AA"), X2292 = c("GA",
 "TT", "TT", "CC", "TT", "CC", "CC", "TT", "CC", "GG", "GA",
 "AA", "GG", "TG", "TC", "AA", "TC", "AA", "AA", "AA"), X2405 = c("GA",
 "TT", "TT", "CC", "TT", "CC", "CC", "TT", "CC", "GG", "GG",
 "AG", "GG", "TG", "TT", "AA", "CT", "AA", "AA", "AA"), X2543 = c("AA",
 "TC", "TT", "CC", "TC", "CC", "CC", "TT", "CC", "GA", "GA",
 "GA", "GG", "TT", "CT", "GA", "TT", "AA", "AA", "GG"), X2557 = c("AG",
 "CT", "TT", "CC", "CT", "CC", "CC", "TT", "CC", "GG", "AG",
 "GA", "GG", "GT", "CT", "GA", "CT", "AA", "AA", "AG"), X2583 = c("GA",
 "CT", "TT", "CC", "CT", "CC", "CC", "TT", "CC", "GG", "GA",
 "GG", "GG", "GG", "CT", "GA", "CT", "AA", "AA", "AG"), X2322 = c("AG",
 TT, TT, CC, TT, CC, CC, TT, CC, GG, GG,
 GG, GG, GT, TT, GG, CC, AA, AA, GA), X2535 = c(AA,
 TC, TT, CC, TT, CC, CC, TT, CC, GG, GA,
 GG, GG, TT, CC, GG, CC, AA, AA, AG), X2536 = c(GA,
 TC, TT, CC, TC, CC, CC, TT, CC, GG, GG,
 AG, GG, TT, TC, AG, TC, AA, AA, GA), X2581 = c(AG,
 CT, TT, CC, CT, CC, CC, TT, CC, GG, GG,
 GA, GG, TT, CC, GA, CT, AA, AA, AG), X2570 = c(AA,
 CT, TT, CC, CT, CC, CC, TT, CC, GG, GG,
 GG, GG, TT, TC, GG, CC, AA, AA, GG), X2476 = c(AA,
 TT, TT, CC, TT, CC, CC, TT, CC, GG, GG,
 GG, GG, GT, TC, AG, CC, AA, AA, AG), X2534 = c(GA,
 TC, TT, CC, TC, CC, CC, TT, CC, GG, GA,
 AG, GG, TG, CC, AG, TC, AA, AA, AA), X2280 = c(AA,
 TC, TT, CC, TC, CC, CC, TT, CC, GG, AG,
 AG, GG, TT, CC, GG, CC, AA, AA, AG), X2316 = c(AA,
 CC, TT, CC, CC, CC, CC, TT, CC, AG, AA,
 AA, AG, TT, TC, GG, CT, AA, GG, GG), X2339 = c(AA,
 CC, TT, CC, CC, CC, CC, TT, CC, GA, AA,
 GG, GG, GT, CT, GG, TT, AA, AA, AG), X2331 = c(AA,
 TC, TT, CC, TC, CC, CC, TT, CC, GG, GG,
 GG, GG, TT, CC, GG, CC, AA, AA, AG), X2343 = c(AA,
 TC, TT, CC, TC, CC, CC, TT, CC, GG, GG,
 GG, GG, TT, CT, GG, CC, AA, AA, GA), X2352 = c(AA,
 TT, TT, CC, TT, CC, CC, TT, CC, GG, AA,
 GG, GG, TT, CC, GG, CC, AA, GA, AG), X2293 = c(GA,
 TT, TT, CC, TT, CC, CC, TT, CC, GG, GA,
 AA, GG, TT, TC, AA, CT, AA, AA, AA), X2338 = c(GA,
 TT, TT, CC, TT, CC, CC, TT, CC, GG, GG,
 AG, GG, TT, TC, AG, TC, AA, AA, GA), X2449 = c(AA,
 TT, TT, CC, TT, CC, CC, TT, CC, GG, AG,
 AA, GG, TT, CC, AA, TC, AA, AA, GA), X2296 = c(GA,
 TT, TT, CC, TT, CC, CC, TT, CC, GA, GG,
 AG, GG, TG, TC, AG, CC, AA, AA, AA), X2453 = c(AG,

Re: [R] how to count A, C, T, G in each row in a big data.frame?

2013-01-09 Thread Yao He

Thanks a lot.

The problem is that I don't know how to handle the output list, as I
want to calculate the frequency of A, G, T, or C by row.


Yao He
2013/1/10 Jessica Streicher j.streic...@micromata.de:
 Sorry, you wanted rows; I wrote for columns.

 #rows would be:
 test2 <- apply(test[,-c(1:4)],1,function(x){table(t(x))})

 #find single values in a row
 sapply(test2,function(row){
 allVars <- paste(names(row),collapse="")
 u <- unique(strsplit(allVars,"")[[1]])
 parts <- sapply(names(row),function(x){u%in%strsplit(x,"")[[1]]})
 mat <- parts%*%row
 rownames(mat) <- u
 mat
 })

 Though I guess lists aren't ideal, there's another answer as well, I see.


Re: [R] how to count A, C, T, G in each row in a big data.frame?

2013-01-09 Thread William Dunlap

Can you get what you need from the following, where 'd' is your data.frame,
the first four columns of which are irrelevant to this problem?
   dd <- d[,-(1:4)] ; table(rownames(dd)[row(dd)], unlist(dd))

        AA AG CC CT GA GG GT TC TG TT
  27412 29 10  0  0 13  1  0  0  0  0
  27413  0  0  4  9  0  0  0 12  0 28
  27414  0  0  0  0  0  0  0  0  0 53
  27415  0  0 53  0  0  0  0  0  0  0
  ...
  27430 46  3  0  0  2  2  0  0  0  0
  27431 19 15  0  0 15  4  0  0  0  0
table() is pretty quick.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
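
To see why the table() call above lines up rows and genotypes, here is a
small sketch on toy data. The data frame `dd`, its row names, and its
values are hypothetical; as.matrix() and as.vector() are used only to
make the column-major ordering explicit.

```r
# Toy stand-in for the genotype columns (hypothetical values).
dd <- data.frame(s1 = c("AA", "CT"), s2 = c("AG", "TT"), s3 = c("AA", "TC"),
                 row.names = c("snp1", "snp2"), stringsAsFactors = FALSE)

m <- as.matrix(dd)   # character matrix, one row per SNP

# row(m) holds the row index of every cell. Indexing with it and
# flattening m both walk the cells column by column, so the two
# vectors passed to table() are aligned cell for cell.
tab <- table(rownames(m)[row(m)], as.vector(m))
tab
```

Each row of `tab` is then the genotype count for one SNP, which is the
shape of the output shown above.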



Re: [R] how to count A, C, T, G in each row in a big data.frame?

2013-01-09 Thread Yao He

It is really a good output. Maybe I could go on with this output.
Every time, I understand R a little better with your help.
The first four cols are irrelevant; including them was an oversight on my part.

2013/1/10 William Dunlap wdun...@tibco.com:
 Can you get what you need from the following, where 'd' is your data.frame,
 the first four columns of which are irrelevant to this problem?
    dd <- d[,-(1:4)] ; table(rownames(dd)[row(dd)], unlist(dd))

         AA AG CC CT GA GG GT TC TG TT
   27412 29 10  0  0 13  1  0  0  0  0
   27413  0  0  4  9  0  0  0 12  0 28
   27414  0  0  0  0  0  0  0  0  0 53
   27415  0  0 53  0  0  0  0  0  0  0
   ...
   27430 46  3  0  0  2  2  0  0  0  0
   27431 19 15  0  0 15  4  0  0  0  0
 table() is pretty quick.

 Bill Dunlap
 Spotfire, TIBCO Software
 wdunlap tibco.com



[R] SRS, Stratified, and Cluster sampling

2013-01-09 Thread David Arnold

Hi,

Has anyone done (or know of) any nice R activities that help introductory
students ( and teachers :) ) better understand the concepts of simple vs
stratified vs cluster sampling?

Any links?

David
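
One possible classroom sketch of the three designs, using only base R.
The population, the stratum and cluster labels, and the sample sizes are
all made up for illustration; this is not an existing activity or
package.

```r
set.seed(42)

# Hypothetical population: 1000 units, 4 strata, 50 clusters of 20 units.
pop <- data.frame(id = 1:1000,
                  stratum = rep(c("a", "b", "c", "d"), each = 250),
                  cluster = rep(1:50, each = 20))
# The response depends on the stratum, so stratification should help.
pop$y <- rnorm(1000, mean = as.integer(factor(pop$stratum)) * 10)

# Simple random sample: 100 units drawn from the whole population.
srs <- pop[sample(nrow(pop), 100), ]

# Stratified sample: 25 units drawn separately within each stratum.
strat <- do.call(rbind, lapply(split(pop, pop$stratum),
                               function(s) s[sample(nrow(s), 25), ]))

# Cluster sample: draw 5 whole clusters and keep every unit in them.
chosen <- sample(unique(pop$cluster), 5)
clus <- pop[pop$cluster %in% chosen, ]

# Compare the three estimates of the population mean.
c(true = mean(pop$y), srs = mean(srs$y),
  stratified = mean(strat$y), cluster = mean(clus$y))
```

Repeating the draws many times and comparing the spread of the three
estimates is one way students could see how the designs differ.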




[R] piece-wise linear regression nls function

2013-01-09 Thread John Sorkin

windows 7, R 2.12
 
I am trying to run a piecewise linear regression with a single knot, i.e. a 
regression composed of two straight lines where the two lines intersect at an x 
value given by the variable knot. I wish to estimate the slope of both lines, 
the value of knot, the x value where the two lines intersect, and an intercept. 
I am using the nls code below, and get the following error message:
Error in nls(FM ~ blow * BMIJS + bhi * sapply(BMIJS - knot, max, 0), start = 
list(knot = 25,  : 
  singular gradient   
nls code:
 
test <- 
nls(FM~blow*BMIJS+bhi*sapply(BMIJS-knot,max,0),start=list(knot=25,blow=1,bhi=1),data=FeMaleData)
summary(test)
 
greatly shortened version of my data (the full data set has 450 records)
 
        FM    BMIJS
2   55.878 40.57273
4   34.270 27.76939
5   20.123 21.73818
6   19.320 19.71203
9   49.701 43.55356
10  51.188 37.84742
11  46.753 37.71003
13  65.079 37.23438
14  37.097 36.81806
15  30.625 29.92783
17  50.617 42.42754
18  63.954 48.78709
20  29.790 26.97648
21  36.558 34.79373
22  41.275 33.03063
24  27.682 27.24508
26  37.968 35.41399
28  24.878 27.20250
30  47.513 35.77961
31  51.315 37.46032
33  41.944 36.40212
34  38.150 32.83818
35  60.719 42.48594
36  42.643 34.29355
38  40.728 32.42817
42  34.814 30.57573
43  32.896 29.32912
44  30.430 25.44183
46  48.986 37.90910
49  47.485 36.34642
52  46.312 38.64647
54  45.228 33.08783
55  45.391 35.86965
59  37.256 32.66507
60  27.367 28.49880
63  38.663 34.34131
64  34.527 29.57858
67  58.368 38.97266
68  13.473 17.35397
69  22.456 20.80958
71  28.829 25.50056
73  15.487 20.22202
76  18.313 21.38991
77  41.535 36.85707
78  56.124 40.51978
80  52.587 40.77256
81  24.991 25.48543
83  56.327 39.97214
84  70.836 36.52915
85  62.294 42.45244
86  39.689 35.18527
87  35.006 35.15136
88  47.378 37.54779
89  18.149 23.99236
90  33.041 28.10476
91  28.884 26.74443
92  37.670 32.25230
94  55.410 43.72364
99  34.461 35.05930
101 59.727 42.83035
102 41.913 35.64677
104 66.644 41.01642
105 55.250 43.86426
107 45.196 31.78370
108 36.476 33.45537
109 34.386 29.08402
110 39.277 36.98500
111 53.789 45.54654
112 33.077 29.09559
116 57.246 39.98031
120 52.546 40.12191
122 34.409 29.70977
123 31.188 28.75295
126 54.567 38.15226
129 19.193 22.71878
133 39.322 33.45712
134 41.415 31.28980
136 57.616 36.94016
140 28.162 24.40219
142 37.524 29.92673
143 29.611 29.15452
144 26.780 26.53462
146 47.219 35.14919
147 35.341 28.68955
148 44.827 37.68317
149 54.180 41.12226
150 41.636 30.00930
151 33.626 28.00164
156 34.334 29.64970
160 36.317 30.12031
161 46.823 35.64603
163 39.506 34.27740
164 61.619 39.20019
169 48.984 35.77558
171 66.467 41.59008
172 70.144 42.79996
173 37.324 31.56521
174 66.882 46.04938
182 54.239 38.21065
184 48.800 32.01630

Thanks,
John
John David Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)

Re: [R] multiple versions of function

2013-01-09 Thread David Winsemius


On Jan 9, 2013, at 1:00 PM, ivo welch wrote:

 mea culpa.
 
 f <- function(...) {
  ## parse out the arguments and then do something with them
 }
 
 ## all of these should result in the same actions
 f(2,3)  ## interprets a to be first and b to be second
 f(a=2,b=3)
 f(b=3,a=2)
 f(data.frame(a=2,b=3))
 f(data.frame(b=3,a=1))
 
In the last two instances you are only passing a single object. I suppose you 
could construct the argument list with

f <- function( a=NA, ...) { code}

But this works:

 f <- function(a=NA, b=NA) if( !is.list(a) ) {print(a); cat("\n"); print(b) } else {
      with(a, {print(a); cat("\n"); print(b)} ) }

There is some concern about using with() inside functions, so you may want to 
access the values with 

   a[["a"]] and a[["b"]]
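
A sketch of the `[[` variant, under the assumption that returning the
values is acceptable in place of printing them (the return format is
chosen here only so the calling styles can be compared):

```r
f <- function(a = NA, b = NA) {
  # A data frame is a list, so is.list() catches the single-object call.
  if (is.list(a)) {
    b <- a[["b"]]   # pull b out first, before a is overwritten
    a <- a[["a"]]
  }
  c(a = a, b = b)
}

f(2, 3)
f(b = 3, a = 2)
f(data.frame(b = 3, a = 2))
```

All three calls give the same named vector, without relying on with().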

Test output.


> f(2,3)
[1] 2

[1] 3
> f(a=2,b=3)
[1] 2

[1] 3
> f(b=3,a=2)
[1] 2

[1] 3
> f(data.frame(a=2,b=3))
[1] 2

[1] 3
> f(data.frame(b=3,a=1))
[1] 1

[1] 3
[1] 1

[1] 3

 


 
 On Tue, Jan 8, 2013 at 8:00 AM, David Winsemius dwinsem...@comcast.net 
 wrote:
 
 On Jan 7, 2013, at 6:58 PM, ivo welch wrote:
 
 hi david---can you give just a little more of an example?  the
 function should work with call by order, call by name, and data frame
 whose columns are the names.  /iaw
 
 
 It is I who should be expecting you to provide an example.
 
 -- David.
 

David Winsemius
Alameda, CA, USA


Re: [R] Sweave, Texshop, and sync with included Rnw file

2013-01-09 Thread Duncan Murdoch


On 13-01-09 3:25 PM, michele caseposta wrote:

Hello everyone.
I am in the process of writing a book in Latex with Texshop, on Mac.
This book contains a lot of R code, hence the need to use Sweave.
I was able to compile Rnw files, and to sync back and forth from the pdf to the 
source Rnw.
My problem now is that the book is divided in Chapters, and every chapter is in 
its own Rnw file.
I can compile them from the main one (book.Rnw) using the directive

\SweaveInput{chapter1.Rnw}

The problem is that, done this way, I lose synchronization 
between the pdf and the source Rnw. If the text is in book.Rnw I can 
synchronize, but if the text is in one of the included files, it just doesn't 
work.
I am using the sweave engine found in the following webpage:

http://cameron.bracken.bz/synctex-with-sweavepgfsweave-in-texshoptexworks

Has anybody succeeded in synchronizing with included Rnw files?


This is a problem addressed by my patchDVI package, available on 
R-forge.  You have a main file (which can be .tex or .Rnw), and put code 
at the start of each .Rnw file to indicate where to find it.  Then you 
just run Sweave on one of the chapters, and it automatically produces 
the full document.


The sample document here:

http://www.umanitoba.ca/statistics/seminars/2011/3/4/duncan-murdoch-using-sweave-R/

includes an appendix describing how to set this up with TeXShop.

Duncan Murdoch


Re: [R] piece-wise linear regression nls function

2013-01-09 Thread David Winsemius


On Jan 9, 2013, at 5:33 PM, John Sorkin wrote:

 windows 7, R 2.12
 
 I am trying to run a piecewise linear regression with a single knot, i.e. a 
 regression composed of two straight lines where the two lines intersect at an 
 x value given by the variable knot. I wish to estimate the slope of both 
 lines, the value of knot, the x value where the two lines intersect, and an 
 intercept. I am using the nls code below, and get the following error message:
 Error in nls(FM ~ blow * BMIJS + bhi * sapply(BMIJS - knot, max, 0), start = 
 list(knot = 25,  : 
  singular gradient   
 nls code:
 
 test <- 
 nls(FM~blow*BMIJS+bhi*sapply(BMIJS-knot,max,0),start=list(knot=25,blow=1,bhi=1),data=FeMaleData)
 summary(test)

I was surprised to see `sapply` inside a formula expression. I instead imagined 
that this might have been what was meant:

 test <- nls( FM ~ blow*BMIJS + bhi*pmax(BMIJS-knot,0) ,
  start=list(knot=25,blow=1,bhi=1),data=FeMaleData)
 summary(test)

Formula: FM ~ blow * BMIJS + bhi * pmax(BMIJS - knot, 0)

Parameters:
     Estimate Std. Error t value Pr(>|t|)    
knot  21.4960     3.2095   6.698 1.39e-09 ***
blow   0.8983     0.1264   7.106 2.02e-10 ***
bhi    0.9551     0.1610   5.931 4.63e-08 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Residual standard error: 5.638 on 97 degrees of freedom

Number of iterations to convergence: 4 
Achieved convergence tolerance: 8.684e-09 

I offer no particular opinion on whether this is sensible, only that it does 
not break the interpreter's understanding of function application, and the knot 
seems within the range of the values, albeit toward the left-hand edge:

 with(FeMaleData, plot(FM~BMIJS) )
 lines(seq(15, 50), predict(test, newdata=list(BMIJS=seq(15, 50)) ) )
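
As a cross-check on an nls() fit of this kind, the knot can also be
profiled over a grid with lm(). The sketch below uses simulated data, not
the FeMaleData set, and it swaps in a different technique (grid search
plus ordinary least squares) for the nls() call above. The names `hinge`,
`knots`, and `best` are made up here, and an intercept is included
because the original question asked for one.

```r
set.seed(1)

# Simulated data with a true knot at x = 30 (hypothetical values).
x <- seq(15, 50, length.out = 200)
y <- 2 + 0.5 * x + 1.5 * pmax(x - 30, 0) + rnorm(200, sd = 1)

# Hinge term: zero to the left of the knot, linear to the right.
hinge <- function(x, knot) pmax(x - knot, 0)

# For a fixed knot the model is linear, so lm() can fit it directly.
# Try a grid of candidate knots and keep the one with the smallest
# residual sum of squares.
knots <- seq(20, 45, by = 0.5)
rss <- sapply(knots, function(k) deviance(lm(y ~ x + hinge(x, k))))
best <- knots[which.min(rss)]

fit <- lm(y ~ x + hinge(x, best))
best
coef(fit)
```

The grid value could also serve as a starting value for `knot` in nls(),
since a poor starting knot is one plausible source of a singular
gradient.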


-- 
David.
 

Re: [R] Random Rectangles

2013-01-09 Thread Jim Lemon


On 01/10/2013 07:37 AM, David Arnold wrote:

Hi,

Just curious. Has anyone out there ever written a script to generate 100
random rectangles such as the ones shown on this page?

http://www2.math.umd.edu/~jlh/214/Random%20Rectangles.pdf



Hi David,
There are a number of ways to generate random rectangles, for instance:

# each row specifies the number of rows and columns of squares
rr.df<-data.frame(nrow=sample(1:12,100,TRUE,prob=12:1),
 ncol=sample(1:12,100,TRUE,prob=12:1))

Then just plot the resulting rectangles:

sqrect<-function(x0,y0,x1,y1) {
 nx<-x1-x0-1
 ny<-y1-y0-1
 for(x in 0:nx) {
  for(y in 0:ny)
   rect(x0+x,y0+y,x0+x+1,y0+y+1)
 }
}

rrPlot<-function(rrdf,div=1.3) {
 nrect<-dim(rrdf)[1]
 plotspace<-nrect/div
 plot(c(1,plotspace),c(1,plotspace),type="n",
  axes=FALSE,xlab="",ylab="",main="Random Rectangles")
 xpos<-ypos<-maxypos<-1
 for(rectangle in 1:nrect) {
  if(xpos+rrdf[rectangle,1] > plotspace) {
   xpos<-1
   ypos<-maxypos
   maxypos<-1
  }
  sqrect(xpos,ypos,xpos+rrdf[rectangle,1],
   ypos+rrdf[rectangle,2])
  xpos<-xpos+rrdf[rectangle,1]+1
  if(ypos+rrdf[rectangle,2] > maxypos)
   maxypos<-ypos+rrdf[rectangle,2]+2
 }
}

The example above does not do any sophisticated placing of the 
rectangles, but more importantly, shows that there are probably unstated 
constraints on the randomness of the rectangles.


Jim


Re: [R] graphical distance matrix

2013-01-09 Thread Jim Lemon


On 01/10/2013 07:50 AM, eliza botto wrote:


Dear R-family,
I made a distance matrix of about 2000 stations. It's extremely hard to 
visualize the details of that matrix. I heard that there is a way in R to 
represent the details of a distance matrix graphically. More precisely, different 
sections of our distance matrix can be presented in different colors, with low 
values presented in light colors and high values in dark. Is there really a 
way of doing it?
Thanks in advance,
regards,
elisa


Hi elisa,
In the example for the function color.scale.lines (plotrix) you will 
find one method of coloring something (lines in this case) depending 
upon the distance from something else (the starting point). With 
judicious modification, I think it might do what you want.
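
Base R's image() is another possible route for this kind of display. What follows is only a minimal sketch, not something taken from the plotrix example: the coordinates are made-up stand-in data (50 random "stations" rather than the 2000 in the question), and the names coords, d and pal are illustrative.

```r
# Stand-in data: 50 "stations" with random coordinates
# (the real distance matrix is not shown in the thread).
set.seed(1)
coords <- cbind(x = runif(50), y = runif(50))
d <- as.matrix(dist(coords))   # 50 x 50 symmetric distance matrix

# Light greys for small distances, dark greys for large ones.
pal <- gray(seq(1, 0, length.out = 64))

# image() draws one coloured cell per matrix entry.
image(seq_len(nrow(d)), seq_len(ncol(d)), d, col = pal,
      xlab = "station", ylab = "station",
      main = "Distance matrix")
```

With 2000 stations the cells become very small, so plotting a subset of rows and columns at a time may be easier to read.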


Jim


Re: [R] how to count A, C, T, G in each row in a big data.frame?

2013-01-09 Thread Yao He

Hi arun
Then how could I split them and get a table of letter counts such as:
 #     id  A  T  C  G
 #1 27412 81  0  0 25
 #2 27413  0 77 29  0

 Thanks

2013/1/10 arun smartpink...@yahoo.com:
 Hi Yao,
 You could also use:
 library(reshape2)
 dd <- dat1[,-(1:4)]
 res <- dcast(melt(within(dd,{id=row.names(dd)}),id.var="id"),id~value,length)
 head(res)
 # id AA AG CC CT GA GG GT TC TG TT
 #1 27412 29 10  0  0 13  1  0  0  0  0
 #2 27413  0  0  4  9  0  0  0 12  0 28
 #3 27414  0  0  0  0  0  0  0  0  0 53
 #4 27415  0  0 53  0  0  0  0  0  0  0
 #5 27416  0  0  3  9  0  0  0 12  0 29
 #6 27417  0  0 53  0  0  0  0  0  0  0

 #Just for comparison:
 dat2 <- dat1[rep(row.names(dat1),2000),]
  nrow(dat2)
 #[1] 40000
  row.names(dat2) <- 1:40000
  dd <- dat2[,-(1:4)]
   system.time(res1 <- table(rownames(dd)[row(dd)], unlist(dd)))
 #   user  system elapsed
 #  5.840   0.104   5.954
  system.time(res2 <-
 dcast(melt(within(dd,{id=row.names(dd)}),id.var="id"),id~value,length))
 #   user  system elapsed
 #  3.100   0.064   3.167
  head(res1,3)

  # AA AG CC CT GA GG GT TC TG TT
  # 1   29 10  0  0 13  1  0  0  0  0
  # 10   0  4  0  0  6 43  0  0  0  0
  # 100 19 15  0  0 15  4  0  0  0  0
  head(res2,3)
 #   id AA AG CC CT GA GG GT TC TG TT
 #1   1 29 10  0  0 13  1  0  0  0  0
 #2  10  0  4  0  0  6 43  0  0  0  0
 #3 100 19 15  0  0 15  4  0  0  0  0

 A.K.







 - Original Message -
 From: Yao He yao.h.1...@gmail.com
 To: R help r-help@r-project.org
 Cc:
 Sent: Wednesday, January 9, 2013 9:23 AM
 Subject: [R] how to count A,C,T,G in each row in a big data.frame?

 Dear All

 I have a data.frame like that:
 structure(list(name = c("Gga_rs10722041", "Gga_rs10722249", "Gga_rs10722565",
 "Gga_rs10723082", "Gga_rs10723993", "Gga_rs10724555", "Gga_rs10726238",
 "Gga_rs10726461", "Gga_rs10726774", "Gga_rs10726967", "Gga_rs10727581",
 "Gga_rs10728004", "Gga_rs10728156", "Gga_rs10728177", "Gga_rs10728373",
 "Gga_rs10728585", "Gga_rs10729598", "Gga_rs10729643", "Gga_rs10729685",
 "Gga_rs10729827"), chr = c(7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L,
 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L), pos = c(11248993L,
 20038370L, 16164457L, 38050527L, 20307106L, 13707090L, 12230458L,
 36732967L, 2790856L, 1305785L, 29631963L, 13606593L, 13656397L,
 2261611L, 32096703L, 13733153L, 16524147L, 558735L, 12514023L,
 3619538L), strand = c("+", "+", "+", "+", "+", "+", "+", "+",
 "+", "+", "+", "+", "+", "+", "+", "+", "+", "+", "+", "+"),
 X2353 = c("AA", "TT", "TT", "CC", "TT", "CC", "CC", "TT",
 "CC", "GG", "AG", "AG", "AG", "TT", "CC", "AG", "CC", "AA",
 "GG", "GG"), X2409 = c("AA", "CT", "TT", "CC", "CT", "CC",
 "CC", "TT", "CC", "GG", "GG", "AG", "AG", "TT", "CC", "AG",
 "CC", "AA", "AG", "GA"), X2500 = c("GA", "TT", "TT", "CC",
 "TT", "CC", "CC", "TT", "CC", "GG", "GG", "GG", "GG", "GT",
 "CT", "GG", "CC", "AA", "AA", "AA"), X2598 = c("AA", "TT",
 "TT", "CC", "TT", "CC", "CC", "TT", "CC", "GG", "AA", "AG",
 "GG", "TT", "CC", "AG", "TC", "AA", "AA", "AG"), X2610 = c("AA",
 "TT", "TT", "CC", "TT", "CC", "CC", "TT", "CC", "GG", "GA",
 "GA", "GG", "TT", "CC", "GA", "CC", "AA", "AA", "GA"), X2300 = c("GA",
 "TT", "TT", "CC", "TT", "CC", "CC", "TT", "CC", "GG", "GA",
 "AA", "AG", "TT", "TC", "AA", "TC", "AA", "AG", "AA"), X2507 = c("AG",
 "TT", "TT", "CC", "TT", "CC", "CC", "TT", "CC", "GG", "GG",
 "GA", "GG", "TT", "TC", "GG", "CC", "AA", "GA", "AG"), X2530 = c("AG",
 "TC", "TT", "CC", "TC", "CC", "CC", "TT", "CC", "GG", "AA",
 "GG", "GG", "TT", "CC", "GG", "CC", "AA", "AA", "AA"), X2327 = c("AA",
 "TT", "TT", "CC", "TT", "CC", "CC", "TT", "CC", "GG", "GA",
 "GG", "GG", "TT", "TC", "GG", "CC", "AA", "AA", "AA"), X2389 = c("AA",
 "CC", "TT", "CC", "CC", "CC", "CC", "TT", "CC", "AG", "GG",
 "AG", "GG", "TT", "TC", "AG", "CC", "AA", "AA", "AA"), X2408 = c("AA",
 "TT", "TT", "CC", "TT", "CC", "CC", "TT", "CC", "GG", "GA",
 "GA", "GG", "TT", "CC", "GA", "CC", "AA", "AA", "AG"), X2463 = c("AA",
 "TT", "TT", "CC", "TT", "CC", "CC", "TT", "CC", "GG", "GG",
 "GG", "GG", "TT", "CT", "GG", "CC", "AA", "AA", "GA"), X2420 = c("GA",
 "TC", "TT", "CC", "TC", "CC", "CC", "TT", "CC", "GG", "AG",
 "GG", "GG", "TG", "TT", "GG", "CT", "AA", "AA", "AA"), X2563 = c("GA",
 "CC", "TT", "CC", "TC", "CC", "CC", "TT", "CC", "GG", "GA",
 "GG", "GG", "GT", "TT", "GG", "CT", "AA", "AA", "AA"), X2462 = c("AA",
 "TT", "TT", "CC", "TT", "CC", "CC", "TT", "CC", "GG", "AA",
 "GG", "GG", "GT", "TC", "GG", "CC", "AA", "AA", "AA"), X2292 = c("GA",
 "TT", "TT", "CC", "TT", "CC", "CC", "TT", "CC", "GG", "GA",
 "AA", "GG", "TG", "TC", "AA", "TC", "AA", "AA", "AA"), X2405 = c("GA",
 "TT", "TT", "CC", "TT", "CC", "CC", "TT", "CC", "GG", "GG",
 "AG", "GG", "TG", "TT", "AA", "CT", "AA", "AA", "AA"), X2543 = c("AA",
 "TC", "TT", "CC", "TC", "CC", "CC", "TT", "CC", "GA", "GA",
 "GA", "GG", "TT", "CT", "GA", "TT", "AA", "AA", "GG"), X2557 = c("AG",
 "CT", "TT", "CC", "CT", "CC", "CC", "TT", "CC", "GG", "AG",
 "GA", "GG", "GT", "CT", "GA", "CT", "AA", "AA", "AG"), X2583 = c("GA",
 "CT", "TT", "CC", "CT", "CC", "CC", "TT", "CC", "GG", "GA",
 "GG", "GG", "GG", "CT", "GA", "CT", "AA", "AA", "AG"), X2322 = c("AG",
 "TT", "TT", "CC", "TT", "CC", "CC", "TT", "CC", "GG", "GG",
 "GG", "GG", "GT", "TT", "GG", "CC", "AA", "AA", "GA"), X2535 = c("AA",
 "TC", "TT", "CC", "TT", "CC", "CC", "TT", "CC", "GG", "GA",
 "GG", "GG", "TT", "CC", "GG", "CC", "AA", "AA", "AG"), X2536 = c("GA",
 "TC", "TT", "CC", "TC", "CC", "CC", "TT", "CC", "GG", "GG",
 "AG", "GG", "TT", "TC", "AG", "TC", "AA", "AA", "GA"), X2581 = c("AG",
 "CT", "TT", "CC", "CT", "CC", "CC", "TT", "CC", "GG", "GG",
 "GA", "GG", "TT", "CC", "GA", "CT", "AA", "AA", "AG"), X2570 = c("AA",
 "CT", "TT", "CC", "CT", "CC", "CC", "TT", "CC", "GG", "GG",
 "GG", "GG", "TT", "TC", "GG", "CC", "AA", "AA", "GG"), X2476 = c("AA",
 "TT", "TT", "CC", "TT", "CC", "CC", "TT", "CC", "GG", "GG",

Re: [R] how to count A, C, T, G in each row in a big data.frame?

2013-01-09 Thread Jorge I Velez

Here is one option (not the best, but does the job):

foo <- function(x) table(factor(unlist(strsplit(as.character(x), "")),
levels = c('A','C','G','T')))
t(apply(d[, -c(1:4)], 1, foo))
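
To see what this does, here is a small worked sketch on made-up data. The toy data frame d below is an assumption for illustration only (two SNP rows, three sample columns); it just mimics the layout in the question, with four annotation columns followed by genotype columns.

```r
# Toy data in the same layout as the question (values are invented).
d <- data.frame(name = c("snp1", "snp2"), chr = c(7L, 7L),
                pos = c(100L, 200L), strand = c("+", "+"),
                X1 = c("AA", "CT"), X2 = c("AG", "TT"), X3 = c("GA", "TC"),
                stringsAsFactors = FALSE)

# Split each genotype into single letters and tabulate them;
# fixing the factor levels keeps zero counts in the result.
foo <- function(x) table(factor(unlist(strsplit(as.character(x), "")),
                                levels = c("A", "C", "G", "T")))

# One row of letter counts per row of d, annotation columns dropped.
counts <- t(apply(d[, -c(1:4)], 1, foo))
counts
```

Each row of counts sums to twice the number of sample columns, which is a quick sanity check on real data.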

What's wrong with Jim Holtman's solution?

HTH,
Jorge.-


On Thu, Jan 10, 2013 at 3:46 PM, Yao He  wrote:

 Hi arun
 Then how could I split them and get a table of letter counts such as:
  #     id  A  T  C  G
  #1 27412 81  0  0 25
  #2 27413  0 77 29  0

  Thanks


[R] ./R: error while loading shared libraries

2013-01-09 Thread Adam Dahman

Hi,
 
I have installed R on linux using a non root account.
 
I am getting this error when trying to use it :
./R: error while loading shared libraries: libRblas.so: cannot open shared 
object file: No such file or directory

 
Linux version I am using :
Linux version 2.6.32-131.17.1.el6.x86_64 
(mockbu...@x86-007.build.bos.redhat.com) (gcc version 4.4.5 20110214 (Red Hat 
4.4.5-6) (GCC) ) #1 SMP Thu Sep 29 10:24:25 EDT 2011

 
Can someone help ?
 
Regards,
Adam

[R] Determining sample size from power function

2013-01-09 Thread Kaveh Zakeri

Hello,

I am trying to get the power function to report the sample size rather than the 
power. My goal is to input a variety of values for theta and then for the power 
function to report the corresponding sample sizes. I haven't had much luck 
trying to create my own function, something along the lines of:

f <- function (x) {
power(N=z,a=6,f=6,pi=.5,alpha=.1,t0=10,theta=(1/x),CIFev0=.476,CIFcr0=0))=0.8
read(z)
}

In the above example, I am trying to fix the power at 0.80 and solve for z, 
which is the sample size. I would like x to be a random distribution of thetas. 
For instance:

x=rnorm(30,.5,.2)

and then receive the 30 corresponding sample sizes.

Thank you!


Re: [R] multiple versions of function

2013-01-09 Thread ivo welch

mea culpa.

f <- function(...) {
  ## parse out the arguments and then do something with them
}

## all of these should result in the same actions
f(2,3)  ## interprets a to be first and b to be second
f(a=2,b=3)
f(b=3,a=2)
f(data.frame(a=2,b=3))
f(data.frame(b=3,a=2))
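
One possible way to get that behaviour, as a minimal sketch only: use ordinary named arguments a and b, and unpack the columns when the first argument turns out to be a data frame. The body (a - b) is a placeholder for "do something with them", chosen because it depends on argument order.

```r
# Sketch: one function that accepts positional arguments, named
# arguments, or a data frame whose columns are named a and b.
f <- function(a, b) {
  if (is.data.frame(a)) {
    # Data-frame call: pull both values out of the columns by name.
    b <- a$b
    a <- a$a
  }
  a - b   # placeholder computation
}

f(2, 3)
f(b = 3, a = 2)
f(data.frame(b = 3, a = 2))
```

Because R matches named arguments before positional ones, the first two call styles need no extra code; only the data frame case needs the explicit check.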



On Tue, Jan 8, 2013 at 8:00 AM, David Winsemius dwinsem...@comcast.net wrote:

 On Jan 7, 2013, at 6:58 PM, ivo welch wrote:

 hi david---can you give just a little more of an example?  the
 function should work with call by order, call by name, and data frame
 whose columns are the names.  /iaw


 It is I who should be expecting you to provide an example.

 -- David.



[R] Interpreting Rasch models

2013-01-09 Thread Lilly Dethier

I'm testing an education assessment (evaluating quantitative skills in
biology students) for reliability. I used Cronbach's alpha, but got a
really low alpha which I think is likely due to my small number of
questions (12) and the fact that the questions are of varying difficulty.
After a lot of reading, it looks like Rasch models are probably a more
appropriate tool for my question so I've figured out how to do that
analysis in R using the eRm package. But now I'm having a really hard time
interpreting the output and finding resources to help.

Here are the results of my RM estimation:
Conditional log-likelihood: -394.9651
Number of iterations: 25
Number of parameters: 11

Can anyone help me interpret? I can post further output if you can help,
but didn't want to send a giant email of results!
Thanks


[R] Parameter estimates for each observation (ordered choice)

2013-01-09 Thread Andreas Karpf

I have several demographic variables with which I want to explain the
ordered choice of individuals within a survey in an ordered choice (probit
or logit, this is not important) framework. Standard ordered choice
estimations of course just give me aggregate/average parameter estimates.
For my task it would however be useful to estimate or extract hypothetical
individual-level parameter estimates (betas) for a certain independent
variable and each individual in the survey.

I have experimented with hierarchical Bayes algorithms provided by the
bayesm and ChoiceModelR packages. Correct me if I am wrong, but I think these
techniques also demand that individuals appear several times within a
survey (thus it should be a panel) and are confronted with different choice
situations, so that one can estimate the influence of certain attributes on
the individuals' choices. In any case, ChoiceModelR and bayesm just provide
multinomial choice models, while I am looking for an ordinal probit.

My data, however, doesn't have any panel structure. I was also experimenting
with Bayesian inference, for example with the MCMCoprobit function in the
MCMCpack package, but this function just simulates betas. As far as I know,
I can't attribute them to certain individuals in the survey, which
would be good. I would be very glad if somebody could give me a hint;
sometimes even a keyword is helpful to google the correct solution!
Thanks and best regards,

AK

P.S.: The last thing I tried was Compound Hierarchical Ordered Probit
(CHOPIT), because with that I am able to calculate individual cut-off points,
which may allow me to calculate individual betas. But I haven't tried it
extensively yet.


Re: [R] Determining sample size from power function

2013-01-09 Thread Jorge I Velez

Dear Kaveh,

Take a look at http://www.statmethods.net/stats/power.html
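
If the power function has no built-in way to solve for N, a general approach is to root-find on N with uniroot(). The sketch below is only an illustration under stated assumptions: power_fn is a hypothetical stand-in (a one-sided z-test approximation), not the power() function from the question, whose package is not named in the thread. Any function that returns power as an increasing function of N could be swapped in.

```r
# Hypothetical stand-in for the poster's power(): power of a one-sided
# z-test with unit variance, effect size theta and sample size N.
power_fn <- function(N, theta, alpha = 0.1) {
  pnorm(sqrt(N) * theta - qnorm(1 - alpha))
}

# Find the (real-valued) N at which power equals the target.
n_for_power <- function(theta, target = 0.8) {
  uniroot(function(N) power_fn(N, theta) - target,
          interval = c(1, 1e6), tol = 1e-9)$root
}

thetas <- c(0.3, 0.5, 0.8)            # illustrative effect sizes
n_real <- sapply(thetas, n_for_power) # one solution per theta
n_needed <- ceiling(n_real)           # round up to whole subjects
n_needed
```

The search interval must bracket the answer, so power at the lower bound has to be below the target and power at the upper bound above it; randomly drawn thetas that are near zero or negative would break that assumption.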

HTH,
Jorge.-


On Thu, Jan 10, 2013 at 3:21 PM, Kaveh Zakeri  wrote:

 Hello,

 I am trying to get the power function to report the sample size rather
 than the power. My goal is to input a variety of values for theta and then
 for the power function to report the corresponding sample sizes. I haven't
 had much luck trying to create my own function, something along the lines
 of:

 f <- function (x) {

 power(N=z,a=6,f=6,pi=.5,alpha=.1,t0=10,theta=(1/x),CIFev0=.476,CIFcr0=0))=0.8
 read(z)
 }

 In the above example, I am trying to fix the power at 0.80 and solve for
 z, which is the sample size. I would like x to be a random distribution of
 thetas. For instance:

 x=rnorm(30,.5,.2)

 and then receive the 30 corresponding sample sizes.

 Thank you!

