[R] Bootstrapping in R

2013-04-25 Thread Preetam Pal
Hi all,

1) I have 3 vectors a, b and c, each of length 25. I want to define a
new data frame z such that z[1] = (a[1] b[1] c[1]), z[2] = (a[2] b[2] c[2])
and so on. How do I do this in R?

2) Then I want to draw bootstrap samples from z.

Kindly suggest how i can do this in R.

Thanks,
Preetam
-- 
Preetam Pal
(+91)-9432212774
M-Stat 2nd Year, Room No. N-114
Statistics Division,   C.V.Raman
Hall
Indian Statistical Institute, B.H.O.S.
Kolkata.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Regarding Modeling - Please! QUICK HELP

2013-04-25 Thread Andrew Cochrane
I'm a student currently working with the *sleepstudy* dataset in
matrix.pkg. It deals with the reaction times of sleep deprived students
over a period of days.

I am trying to model reaction times in order to describe the variation
between students by days they haven't slept.

This is what I'm running in R, but unfortunately I'm missing something:


 logmod11 <- lmer(log(Reaction) ~ (Subject|Days),REML=FALSE)


This is obviously incorrect, so if someone could give me some quick help
I'd really appreciate it.

Thanks!

-- 
J. Andrew Cochrane
University of Illinois | 2013
College of Liberal Arts and Sciences | Statistics
(630) 991-7502



Re: [R] Assigning a variable value based on multiple columns

2013-04-25 Thread Patrick Coulombe
Hi Jason,

I think the easiest approach would be to keep your current ifelse
statements as they are, but temporarily change your NA values into
something else (e.g., -999). To do this in one line, you can use the
package gdata.

In this code, I assume that your data are stored in the variable dataset:


###
#install package gdata if not yet installed
install.packages("gdata")

#load package gdata
library(gdata)

#change NA into -999
dataset <- NAToUnknown(dataset, -999)


#do your ifs/ifelses here...
#...
#...


#change -999 back into NA
dataset <- unknownToNA(dataset, -999)



And that should do it.
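
For comparison, here is a minimal base-R sketch that skips the recoding step and treats NA as "not 1" directly. The data frame is retyped from the example in the question below, and the helper name is1 is only an illustration, not something from gdata or from the original post.

```r
# Example data retyped from the question (NA = unknown status)
dataset <- data.frame(
  ID       = 1:5,
  Diabetes = c(0, 1, NA, 0, 1),
  ESRD     = c(0, 0, 1, NA, 1),
  HIV      = c(NA, NA, 0, 0, 1),
  Contact  = c(0, 0, 0, 1, 0)
)

# Hypothetical helper: TRUE only where the value is known and equal to 1
is1 <- function(x) !is.na(x) & x == 1

high   <- is1(dataset$HIV) | is1(dataset$Contact)      # cutoff 5
medium <- is1(dataset$Diabetes) | is1(dataset$ESRD)    # cutoff 10

# The lowest applicable cutoff wins; NA values are simply ignored
dataset$TSTcutoff <- ifelse(high, 5, ifelse(medium, 10, 15))
print(dataset)
```

Because the NA handling lives in one small helper, the nested ifelse stays readable and no sentinel value such as -999 is needed.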

Hope this helps,
Patrick


2013/4/24 Jason Stout, M.D. jason.st...@duke.edu

 Hi All,

 I'm hoping someone can help me with a relatively simple problem.  Take the 
 following dataset:

 ID  Diabetes  ESRD  HIV  Contact
 1   0         0     NA   0
 2   1         0     NA   0
 3   NA        1     0    0
 4   0         NA    0    1
 5   1         1     1    0

 I want to generate a column called TSTcutoff based on the values in the row.  
 TSTcutoff would be the lower of 15 (if Diabetes=ESRD=HIV=Contact=0), 10 (if 
 Diabetes or ESRD=1 AND HIV=Contact=0), or 5 (if HIV OR Contact=1).  I was 
 thinking this could be done with a series of IFELSE statements, but the NA 
 values make this more challenging.  I want to ignore NA values when 
 calculating TSTcutoff.  So the final dataset should look like this:

 ID  Diabetes  ESRD  HIV  Contact  TSTcutoff
 1   0         0     NA   0        15
 2   1         0     NA   0        10
 3   NA        1     0    0        10
 4   0         NA    0    1        5
 5   1         1     1    0        5

 Thanks for any suggestions.

 Jason Stout, MD, MHS
 Box 102359-DUMC
 Durham, NC 27710
 FAX 919-681-7494



Re: [R] Regarding Modeling - Please! QUICK HELP

2013-04-25 Thread Patrick Coulombe
Hi Andrew,

I don't know the dataset at all (and you seem to assume that your
readers will), but anyway: it looks like you're trying to fit an
intercept-only model. If that's the case, try:

 logmod11 <- lmer(log(Reaction) ~ 1 + (1|Subject),REML=FALSE)

1 is the intercept, and anything in parentheses is a random
effect. In this case, the intercept is random, and your level-2 grouping
variable is Subject (several lines per Subject).

If you want to add Days as a predictor, try:

 logmod11 <- lmer(log(Reaction) ~ 1 + Days + (Days|Subject),REML=FALSE)

Here, both the intercept and the coefficient for Days are random
(allowed to vary for each Subject). Don't forget to include the
dataset after your formula if it's not attached to your environment.
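
As a rough, self-contained sketch of the two models above: this assumes the lme4 package is installed and that sleepstudy is the dataset shipped with lme4. The object names m_int and m_days are illustrative only.

```r
# Sketch only: runs if lme4 is available, otherwise does nothing
if (requireNamespace("lme4", quietly = TRUE)) {
  data("sleepstudy", package = "lme4")

  # Random intercept per Subject, no predictors
  m_int <- lme4::lmer(log(Reaction) ~ 1 + (1 | Subject),
                      data = sleepstudy, REML = FALSE)

  # Days as fixed effect, with intercept and Days slope varying by Subject
  m_days <- lme4::lmer(log(Reaction) ~ 1 + Days + (Days | Subject),
                       data = sleepstudy, REML = FALSE)

  # Likelihood-ratio comparison of the two fits
  print(anova(m_int, m_days))
}
```

Passing data = sleepstudy explicitly avoids relying on attach(), which is the point of the last remark above.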

Hope this helps,
Patrick




Re: [R] Regarding Modeling - Please! QUICK HELP

2013-04-25 Thread Rolf Turner


Sorry, this list has a No homework policy.
Please ask your lecturer or tutor about this.

cheers,

Rolf Turner



[R] Missing data

2013-04-25 Thread Roslina Zakaria
Dear r-users,

I would like to investigate how to fill in missing data.  I started with 
a complete data series and want to introduce missing values into it.  Then I 
would use some method to fill in the missing values and compare the result with 
the original data to see how good the method is.  My question is: how do I 
introduce missing data into my complete data systematically, for example so that 
every 10th value is erased and treated as missing?  Here are some rainfall data:

125
130.3
327.2
252.2
33.8
6.1
5.1
0.5
0.5
0
2.3
0
0
0
0
0
0
0
0
0
0.8
5.1
0
0.3
0
0
0
0
0
0
45.7
43.4
0
0
0
0
0

Thank you so much for any help given.  I hope my question is clear.


Re: [R] R Interactive Mode

2013-04-25 Thread Jim Lemon

On 04/24/2013 06:53 PM, Hrachya Astsatryan wrote:

Dear all,

We are doing some research about the time series analysis of NDVI, and
we found the NDVITS package which is a very great tool.
Unfortunately when we run it, after  TimeSeriesAnalysis it asks to enter
Village or Country.

library(ndvits, lib.loc="/home/vahe/R/i686-pc-linux-gnu-library/2.15")
ndvidirectory=paste(system.file("extdata/VITO_Mzimba",
  package="ndvits"), "/", sep="")
region="Mzimba"
Ystart=2004
Yend=2006
shape="SLP_Mzimba"
shapedir=paste(system.file("extdata/shape", package="ndvits"),
 "/", sep="")
outfile = "mzimbaTS2.txt"
outfile2 = "MzimbaTS2.pdf"
outfiel3 = "my.pdf"
signal = TimeSeriesAnalysis(shape, shapedir, ndvidirectory, region,
Ystart, Yend, outfile, outfile2)


How can we call the package so that the Village option is selected by
default? (We don't want to enter the parameter interactively, and we don't
want to change anything in the code, which is quite difficult.)


Hi Hrach,
I thought I would have a shot at this, and it has been an education. 
There are a lot of dependencies. I was unable to trace which function 
asks the question "Village or Country" and it may be hidden. As a guess, 
I would say that this disambiguates names that may refer to both an area 
and a part of that area. Perhaps you could try something like:


region="Mzimba (Village)"

Jim



Re: [R] Tables package - remove NAs and NaN

2013-04-25 Thread Liviu Andronic
On Wed, Apr 24, 2013 at 9:23 PM, Santosh santosh2...@gmail.com wrote:
 Dear Rxperts,
 Sorry if I am posting a really really dumb request.. I am new to subversion
 and am trying to use subversion to download the tables package as suggested
 by Duncan. I installed subversion client(from collabnet) and tried to
 access tables package using the command below.

 svn checkout svn://scm.r-forge.r-project.org/svnroot/tables/

I don't know what's wrong here, but I would suggest that you use an
SVN GUI for Windows (like RapidSVN or TortoiseSVN). This should avoid
space related issues.

Regards,
Liviu


 I get the following error message:
 C:\Users\santosh\temp>svn checkout svn://scm.r-forge.r-project.org/svnroot/tables/
 svn: E730060: Unable to connect to a repository at URL
 'svn://scm.r-forge.r-project.org/svnroot/tables'
 svn: E730060: Can't connect to host 'scm.r-forge.r-project.org': A
 connection attempt failed because the connected party did not properly
 respond after a period of time, or established connection failed because
 connected host has failed to respond.

 Is there anything additional I need to do with Subversion or with the
 commands?


 Regards,
 Santosh

 On Tue, Apr 23, 2013 at 5:13 AM, Duncan Murdoch murdoch.dun...@gmail.com wrote:

 On 13-04-23 6:31 AM, Duncan Murdoch wrote:

 On 13-04-22 10:40 PM, David Winsemius wrote:


 On Apr 22, 2013, at 5:49 PM, Santosh wrote:

  Dear Rxperts,
 q <- data.frame(p=rep(c("A","B"),each=10,len=30),
 a=rep(c(1,2,3),each=10),id=seq(30),
 b=round(runif(30,10,20)),
 c=round(runif(30,40,70)))
 The operation below...
 tabular(((p=factor(p))*(a=factor(a))+1) ~ (N = 1) + (b + c)*
 (mean+sd),data=q)
 yields some rows of NAs and NaN as shown below

              b            c
 p   a  N     mean  sd     mean  sd
 A   1  10    16.30 2.497  52.30  9.358
     2   0      NaN    NA    NaN     NA
     3  10    15.60 2.716  60.30  8.001
 B   1   0      NaN    NA    NaN     NA
     2  10    15.40 2.366  57.70 10.414
     3   0      NaN    NA    NaN     NA
 All    30    15.77 2.473  56.77  9.601

 How do I remove the rows having N=0 ?
 I would like the resulting table look like..
              b            c
 p   a  N     mean  sd     mean  sd
 A   1  10    16.30 2.497  52.30  9.358
     3  10    15.60 2.716  60.30  8.001
 B   2  10    15.40 2.366  57.70 10.414
 All    30    15.77 2.473  56.77  9.601


 Here's a bit of a hack:

 tabular( (`p a`=interaction(p,a, drop=TRUE, sep=" ")) ~ (N = 1) + (b +
 c)*
   (mean+sd),data=q)

           b            c
 p a  N    mean sd      mean sd
 A 1  10   12.8 0.7888  52.1 8.020
 B 2  10   16.3 3.0569  54.9 8.711
 A 3  10   14.6 3.7771  56.5 6.980

 I have been rather hoping that Duncan Murdoch would have noticed the
 earlier thread, but maybe he can comment on whether there is a more direct
 route/


 This isn't something that the package is designed to handle:  if you say
 p*a, it wants all combinations of p and a.
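
 To see why the crossed formula yields empty rows while the interaction hack does not, here is a small base-R sketch. It does not use the tables package at all; it only counts cells. The data follow the layout in the question, with a seed added as an assumption so the run is repeatable.

```r
set.seed(42)  # assumption: seed added only for reproducibility
q <- data.frame(p = rep(c("A", "B"), each = 10, len = 30),
                a = rep(c(1, 2, 3), each = 10),
                b = round(runif(30, 10, 20)),
                c = round(runif(30, 40, 70)))

# Crossing p and a gives all 2 x 3 cells, three of which are empty
full_counts <- table(q$p, q$a)
print(full_counts)

# interaction(..., drop = TRUE) keeps only the combinations that occur
grp <- interaction(q$p, q$a, drop = TRUE, sep = " ")
print(levels(grp))

# Group sizes and means for the observed combinations only
print(tapply(q$b, grp, length))
print(tapply(q$b, grp, mean))
```

 The empty cells in the crossed table are exactly the rows that show up with N = 0 and NaN means in the tabular() output.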

 If I wanted a table like that, I'd use a different hack.  One
 possibility is to create that interaction column, but display it as just
 the initial letter, labelled p, and then add another column to contain
 the a values as data.  It would be tricky to get the formatting right.

 Another possibility is to generate the whole table with the N=0 rows,
 and then post-process it to remove those rows, and adjust the row labels
 appropriately.  This approach probably gives the nicer result, but the
 post-processing is quite messy:  you need to delete some rows from the
 table, from its rowLabels attribute, and from the justification
 attributes of both the table and its rowLabels.  (I should add a [
 method to the package to hide this messiness.)


 I've done this now, in version 0.7.54 on R-forge.  To leave out the rows
 with N=0, you can select a subset of the table where N (the first column)
 is non-zero:

 tab <- tabular(((p=factor(p))*(a=factor(a))+1) ~ (N = 1) + (b +
 c)*(mean+sd),data=q)

 tab[ tab[,1] > 0, ]

 and it produces this:


              b            c
 p   a  N     mean  sd     mean sd
 A   1  10    16.20 3.458  56.3 10.155
     3  10    13.60 2.119  58.1  8.075
 B   2  10    14.40 2.547  51.2  9.438
 All    30    14.73 2.888  55.2  9.419

 Indexing of tables isn't as general as indexing of matrices, but most of
 the simple forms should work.  I haven't tested yet, but I expect this will
 be fine in LaTeX or HTML (also new, not on CRAN yet) output as well.

 Duncan Murdoch



Re: [R] identify object that causes Error in loadNamespace(name) : there is no package called ‘R.utils’

2013-04-25 Thread Liviu Andronic
Dear Duncan,


On Wed, Apr 24, 2013 at 11:04 PM, Duncan Murdoch
murdoch.dun...@gmail.com wrote:
 What I've done sometimes in debugging is to change that error to a
 warning in the getNamespace() function, and add some tracing code to the
 serialization code to print the names of objects as they are loaded.
 (This goes in ReadItem in src/main/serialize.c.)

 I wouldn't expect Liviu to make those changes, but perhaps a verbose
 option could be added to load(), so that it could be available to users.

 I have added this in R-devel.  The format of the printed output may well
 change before this is ever released, but it should be enough to identify the
 bad item already.

 You'll need a build of R-devel from r62658 or newer to see this.  Then

 load("/tmp/a.rda", verbose=TRUE)

 will print the names of objects as they are read (the names are read after
 the attributes and before the value).  If you want to see reams of mostly
 useless information, you can try verbose=n (for some number n=2 or more);
 this prints names and component numbers to a greater depth.

Thank you for adding this in R. I will likely test this feature when
it gets released.
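
For readers on an R version that already has the verbose argument to load(), a small self-contained sketch looks like the following. The temporary file and the objects x and y are made up for illustration.

```r
# Save two small objects to a temporary .rda file
tmp <- tempfile(fileext = ".rda")
x <- 1:3
y <- letters
save(x, y, file = tmp)

# Load into a separate environment; verbose = TRUE prints each
# object name as it is read, which helps locate a problematic item
e <- new.env()
loaded <- load(tmp, envir = e, verbose = TRUE)
print(loaded)
```

Loading into a fresh environment rather than the workspace keeps the experiment from overwriting existing objects.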

Best regards,
Liviu



[R] installing package

2013-04-25 Thread Gitte Brinch Andersen
Hi

I am trying to install a package (bioconductor) but every time I try to install 
it I get this message:

> source("http://bioconductor.org/biocLite.R")
Warning in install.packages("BiocInstaller", repos = a["BioCsoft", "URL"]) :
  'lib = "C:/Program Files/R/R-3.0.0/library"' is not writable
Error in install.packages("BiocInstaller", repos = a["BioCsoft", "URL"]) : 
  unable to install packages
> biocLite("methylumi")

I normally use mac computers, but I cannot get the right path for the folders I 
should use, so now I am trying with a windows platform instead. But now I 
cannot install one of the packages my pipeline needs.

Can anyone help?

I know it is probably a simple problem, but I have never used R before and 
don't know how to solve problems in it.

Best

Gitte Andersen

E-mail: gitt...@hum-gen.au.dk



Re: [R] installing package

2013-04-25 Thread Pascal Oettli

Hi,

Do you have administrator rights?

Regards,
Pascal
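
If administrator rights are not available, one common workaround is to install into a library directory you can write to. The sketch below uses a throwaway directory under tempdir() purely as a stand-in; in practice you would pick a permanent folder, and the package name in the commented line is a placeholder.

```r
# Stand-in for a personal library directory (assumption: any writable path)
personal_lib <- file.path(tempdir(), "Rlib-demo")
dir.create(personal_lib, recursive = TRUE, showWarnings = FALSE)

# Put it first on the library search path
.libPaths(c(personal_lib, .libPaths()))

# Check which library directories are writable (file.access returns 0 on success)
writable <- file.access(.libPaths(), mode = 2) == 0
print(data.frame(library = .libPaths(), writable = writable))

# Installation would then target the first, writable library, e.g.:
# install.packages("somePackage", lib = .libPaths()[1])
```

The "is not writable" warning in the question refers to the system library under Program Files, so a writable first entry in .libPaths() is what matters here.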




Re: [R] Loop for main title in a plot

2013-04-25 Thread Blaser Nello
You could use bquote. Something like this:

a <- c(1,2,3,4)
b <- c(1,2,3,4)
nTrials <- length(a)

for (trial in 1:nTrials) { 
  plot(x=a[1:trial], y=b[1:trial],
   ylab=expression(paste(Apple[P])),
   xlab=expression(paste(Banana^th)),
   main=bquote(italic("i-")~.(trial)^th~choice))
 
}
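
If the goal is the countdown described in the question (i-th choice for the last plot, (i-1)-th for the one before, and so on), a sketch along these lines builds the titles with bquote. The variable names titles and k are illustrative, and the exact typography wanted is an assumption.

```r
nTrials <- 5

# One plotmath title per trial; k counts back from the last trial
titles <- lapply(seq_len(nTrials), function(trial) {
  k <- nTrials - trial
  if (k == 0) {
    bquote(italic(i)^th ~ choice)
  } else {
    bquote((italic(i) - .(k))^th ~ choice)   # .(k) inserts the value of k
  }
})

# Show the unevaluated plotmath expressions
for (trial in seq_len(nTrials)) print(titles[[trial]])
```

Each element can then be passed as main = titles[[trial]] inside the plotting loop.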

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Eva Günther
Sent: Donnerstag, 25. April 2013 06:22
To: r-help@r-project.org
Subject: [R] Loop for main title in a plot

Hi all,

I have a problem with including my plot in a loop. Here is a simple example for 
one plot:

# Plot simple graph with super- and subscript
a <- c(1,2,3,4)
b <- c(1,2,3,4)

plot(x=a,y=b,
ylab=expression(paste(Apple[P])),
xlab=expression(paste(Banana^th)),
main=expression(paste(italic("i-")~4^th~choice)))

Now I would like to make the title (main) a function of the number of 
trials:

for (trial in 1:nTrials) { plot(

main=expression(paste(italic("i-")~trial^th~choice)))

}

e.g. nTrials = 5
The title should look like this:

5th plot: i ^th choice
4th plot: i-1 ^th choice
3rd plot: i-2 ^th choice and so on

I have trouble creating that, could you please help me?

Thank you!!



[R] fdrtool qvalues

2013-04-25 Thread Catherine Tetard
Hi,

I've just started using R and fdrtool, and I'm not sure if the qvalues I'm 
getting back are accurate. I ran fdrtool on pvalues obtained from a two-way 
anova on proteomics data. So I have 266 data values (protein spots) for two 
factors and their interaction (ft, vr, and ft x vr) for each biological sample.

One of the two factors (vr) has a highly significant effect with 119 protein 
spots significantly affected at p<0.05. When I run fdrtool, the qvalues are 
slightly higher than the pvalues (as expected), but only up to p<0.01. Between 
p<0.01 and p<0.05, the qvalues are lower, giving me more significant protein 
spots at that level - is this correct?

The other factor (ft) had only 3 weakly significant protein spots. When I run 
fdrtool, all 266 qvalues are 1.

The interaction effect (ftxvr) produced 14 significant pvalues (mostly p<0.05, 
a couple are p<0.01). fdrtool produces qvalues ranging between 0.87 and 0.99, and 
they rise with rising pvalues, so I lose the significant results here.
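
As a point of comparison only, the sketch below uses base R's p.adjust with the Benjamini-Hochberg method rather than fdrtool, on simulated p-values (the mixture used to simulate them is an arbitrary assumption). BH-adjusted values are never smaller than the raw p-values and never decrease as p increases; an estimator that also estimates the proportion of true nulls, as I understand fdrtool does, need not satisfy the first property.

```r
set.seed(1)
# 266 simulated p-values: a uniform "null" part plus a part skewed toward 0
p <- c(runif(150), rbeta(116, 0.3, 4))

# Benjamini-Hochberg adjusted p-values (not the fdrtool estimator)
q_bh <- p.adjust(p, method = "BH")

ord <- order(p)
print(head(cbind(p = p[ord], q_bh = q_bh[ord]), 10))

# Count discoveries at the 5% level before and after adjustment
print(c(raw = sum(p < 0.05), BH = sum(q_bh < 0.05)))
```

Running both procedures on the same p-values and plotting q against p is a quick way to see where, and by how much, the two sets of q-values diverge.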

Best wishes,
Catherine



Re: [R] Floating point precision causing undesireable behaviour when printing as.POSIXlt times with microseconds?

2013-04-25 Thread Jim Holtman
See R FAQ 7.31.

Also, if you are using POSIXct for current dates, the resolution is down to 
about a millisecond.
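
The FAQ 7.31 effect can be reproduced in a few lines. This sketch assumes the value under discussion is 1.999999 seconds and compares the rounding gap against the two thresholds mentioned in the question quoted below.

```r
x <- 1.999999

# Distance between x and its value rounded to whole seconds
gap <- abs(x - round(x, 0))
print(sprintf("%.20f", gap))

# Nominally the gap is 1e-06, but in binary floating point it falls just short
print(gap < 1e-06)                      # strict comparison against 1e-06
print(gap < 5e-07)                      # comparison against half a microsecond
print(isTRUE(all.equal(gap, 1e-06)))    # tolerance-based comparison
```

A strict less-than test against a value that is only nominally equal is exactly the situation FAQ 7.31 warns about; all.equal() or a threshold safely away from the boundary avoids it.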

Sent from my iPad

On Apr 24, 2013, at 13:57, O'Hanlon, Simon J simon.ohan...@imperial.ac.uk 
wrote:

 Dear list,
 When using as.POSIXlt with times measured down to microseconds the default 
 format.POSIXlt seems to cause some possibly undesirable behaviour:
 
 According to the code in format.POSIXlt the maximum accuracy of printing 
 fractional seconds is 1 microsecond, but if I do:
 
 options( digits.secs = 6 )
 as.POSIXlt( 1.000002 , tz="", origin="1970-01-01")
 as.POSIXlt( 1.999998 , tz="", origin="1970-01-01")
 as.POSIXlt( 1.999999 , tz="", origin="1970-01-01")
 
 I get, respectively:
 [1] "1970-01-01 01:00:01.000002 BST"
 [1] "1970-01-01 01:00:01.999998 BST"
 [1] "1970-01-01 01:00:01 BST"
 
 If options( digits.secs = 6 ) should I not expect to be able to print 
 1.999999 seconds? This seems to be caused by the following code fragment in 
 format.POSIXlt:
 
 np <- getOption("digits.secs")
 if (is.null(np))
     np <- 0L
 else np <- min(6L, np)
 if (np >= 1L)
     for (i in seq_len(np) - 1L) if (all(abs(secs - round(secs,
         i)) < 1e-06)) {
         np <- i
         break
     }
 
 Specifically
     for (i in seq_len(np) - 1L) if (all(abs(secs - round(secs,
         i)) < 1e-06))
 Which in the case of 1.999999 seconds will give:
 
 options( scipen = 10 )
 np <- 6
 sapply( seq_len(np) - 1L , function(x) abs(1.999999 - round(1.999999, x)) )
 
          [,1]     [,2]     [,3]     [,4]     [,5]     [,6]
 [1,] 0.000001 0.000001 0.000001 0.000001 0.000001 0.000001
 
 The logical test all( ... < 1e-06)  should evaluate to FALSE but due to 
 floating point precision it evaluates TRUE:
 
 sprintf( "%.20f" , abs(1.999999 - round(1.999999,5)))
 [1] "0.00000099999999991773"
 
 If instead of:
 
     for (i in seq_len(np) - 1L) if (all(abs(secs - round(secs,
         i)) < 1e-06))
 
 in format.POSIXlt we had a comparison value that was half the minimum 
 increment:
 
     for (i in seq_len(np) - 1L) if (all(abs(secs - round(secs,
         i)) < 5e-07))
 
 This behaviour disappears:
 
 mod.format.POSIXlt( as.POSIXlt( 1.999999 , tz="", origin="1970-01-01") )
 [1] "1970-01-01 01:00:01.999999"
 
 But I am unsure if the original behaviour is what I should expect given the 
 documentation (I have read it and I can't see a reason to expect 1.999999 to 
 round down to 1). And also if changing the formatting function would have 
 other undesirable consequences?
 
 My sessionInfo():
 R version 3.0.0 (2013-04-03)
 Platform: x86_64-w64-mingw32/x64 (64-bit)
 
 locale:
 [1] LC_COLLATE=English_United Kingdom.1252
 [2] LC_CTYPE=English_United Kingdom.1252
 [3] LC_MONETARY=English_United Kingdom.1252
 [4] LC_NUMERIC=C
 [5] LC_TIME=English_United Kingdom.1252
 
 attached base packages:
 [1] stats graphics  grDevices utils datasets  methods   base
 
 
 
 Thank you,
 
 
 Simon
 
 
 
 Simon O'Hanlon
 Postgraduate Researcher
 
 Helminth Ecology Research Group
 Department of Infectious Disease Epidemiology
 Imperial College London
 St. Mary's Hospital, Norfolk Place,
 London, W2 1PG, UK
 
 Office: +44 (0) 20 759 43229
 
 
 


Re: [R] Bootstrapping in R

2013-04-25 Thread Michael Weylandt


On Apr 25, 2013, at 7:02, Preetam Pal lordpree...@gmail.com wrote:

 Hi all,
 
 1) I have 3 vectors a, b and c, each of length 25. I want to define a
 new data frame z such that z[1] = (a[1] b[1] c[1]), z[2] = (a[2] b[2] c[2])
 and so on. How do I do this in R?
 

z <- data.frame(a, b, c)


 
 2) Then I want to draw bootstrap samples from z.

Look at the boot package. 

MW
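
For a first look without any extra package, rows of z can be resampled with sample(). The sketch below uses simulated vectors in place of the poster's data and column means as an arbitrary example statistic; the boot package offers a more complete interface.

```r
set.seed(123)
# Simulated stand-ins for the three vectors of length 25
a <- rnorm(25); b <- rnorm(25); c <- rnorm(25)
z <- data.frame(a, b, c)

B <- 200  # number of bootstrap replicates (arbitrary choice)

# Resample whole rows with replacement so (a[i], b[i], c[i]) stay together
boot_means <- replicate(B, {
  idx <- sample(nrow(z), replace = TRUE)
  colMeans(z[idx, ])
})

# Bootstrap standard errors of the three column means
print(apply(boot_means, 1, sd))
```

Resampling row indices rather than each vector separately is what preserves the pairing the data frame was built for.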

 


[R] Decomposing a List

2013-04-25 Thread Ted Harding
Greetings!
For some reason I am not managing to work out how to do this
(in principle) simple task!

As a result of applying strsplit() to a vector of character strings,
I have a long list L (N elements), where each element is a vector
of two character strings, like:

  L[1] = c("A1","B1")
  L[2] = c("A2","B2")
  L[3] = c("A3","B3")
  [etc.]

From L, I wish to obtain (as directly as possible, e.g. avoiding
a loop) two vectors each of length N where one contains the strings
that are first in the pair, and the other contains the strings
which are second, i.e. from L (as above) I would want to extract:

  V1 = c("A1","A2","A3",...)
  V2 = c("B1","B2","B3",...)

Suggestions?

With thanks,
Ted.

-
E-Mail: (Ted Harding) ted.hard...@wlandres.net
Date: 25-Apr-2013  Time: 11:16:46
This message was sent by XFMail



Re: [R] Decomposing a List

2013-04-25 Thread Jorge I Velez
Dear Dr. Harding,

Try

sapply(L, "[", 1)
sapply(L, "[", 2)
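
A small end-to-end sketch of the same idea, with made-up input strings and "-" as an assumed separator. vapply is shown as a stricter alternative that checks each result is a single string.

```r
# Made-up input; the real strings and separator are not given in the post
s <- c("A1-B1", "A2-B2", "A3-B3")
L <- strsplit(s, "-", fixed = TRUE)

# "[" is the subsetting function, applied to each list element
V1 <- sapply(L, "[", 1)

# Same extraction with an explicit type check on every element
V2 <- vapply(L, function(x) x[2], character(1))

print(V1)
print(V2)
```

vapply fails loudly if any element is not a length-one character value, which can catch malformed input early.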

HTH,
Jorge.-





Re: [R] identify object that causes Error in loadNamespace(name) : there is no package called ‘R.utils’

2013-04-25 Thread Duncan Murdoch

On 13-04-25 3:46 AM, Liviu Andronic wrote:

Dear Duncan,



Thank you for adding this in R. I will likely test this feature when
it gets released.


That will be about a year from now...

Duncan Murdoch



Re: [R] Missing data

2013-04-25 Thread Rainer Schuermann
I read your data into a dataframe

 x <- read.table( "clipboard" )

and renamed the only column

 colnames( x )[1] <- "orig"

With a loop, I created a 2nd column "miss" where in every 10th row the 
observation is set to NA:

for( i in 1 : length( x$orig ) )
{
  if( as.integer( rownames( x )[ i ] ) %% 10 == 0 )
  {
    x$miss[i] <- NA
  } else { 
    x$miss[i] <- x$orig[i]
  }
}

This is probably the least elegant of all possible solutions but it works...
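
A vectorised sketch of the same step, with the rainfall values typed in directly so that it does not depend on the clipboard. The variable name rain is only illustrative.

```r
# Rainfall values from the original question (37 observations)
rain <- c(125, 130.3, 327.2, 252.2, 33.8, 6.1, 5.1, 0.5, 0.5, 0,
          2.3, 0, 0, 0, 0, 0, 0, 0, 0, 0,
          0.8, 5.1, 0, 0.3, 0, 0, 0, 0, 0, 0,
          45.7, 43.4, 0, 0, 0, 0, 0)

x <- data.frame(orig = rain)

# Copy the column, then blank out every 10th observation in one step
x$miss <- x$orig
x$miss[seq(10, nrow(x), by = 10)] <- NA

print(x)
```

Changing the by argument, or replacing seq() with sample(), gives other missingness patterns without touching the rest of the code.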

Rgds,
Rainer






Re: [R] mgcv: how select significant predictor vars when using gam(...select=TRUE) using automatic optimization

2013-04-25 Thread Jan Holstein
Juliet,

for you the diagnostic plots:


just to recall:
the first model was this:

 fit <- gam(target ~ s(mgs)+s(gsd)+s(mud)+s(ssCmax),
            family=quasi(link=log), data=wspe1, method="REML", select=FALSE)
 
  summary(fit)   

 Parametric coefficients:
             Estimate Std. Error t value Pr(>|t|)
 (Intercept)   -4.724      7.462  -0.633    0.527
 Approximate significance of smooth terms:
              edf Ref.df      F p-value
 s(mgs)     3.118  3.492  0.099   0.974
 s(gsd)     6.377  7.044 15.596  <2e-16 ***
 s(mud)     8.837  8.971 18.832  <2e-16 ***
 s(ssCmax)  3.886  4.051  2.342   0.052 .
 ---
 R-sq.(adj) =  0.403   Deviance explained = 40.6%
 REML score =  33186  Scale est. = 8.7812e+05  n = 4511

(I slightly shortened the output)

Also of interest:
Model error as  root mean squared error (RMSE):

  sqrt(mean(residuals.gam(fit,type="response")^2))
 [1] 934.6647

Here are diagnostic plots:

http://r.789695.n4.nabble.com/file/n4665370/screen-capture-1.png 

http://r.789695.n4.nabble.com/file/n4665370/screen-capture-2.png 

Here is Simon's comment on this particular model, from Apr 18, 2013; 5:25pm (see
above):

The p-value computations are based on 
the approximation that things are approximately normal on the linear 
predictor scale, but actually they are no where close to normal in this 
case, which is why the p-values look inconsistent. The reason that the 
approximate normality assumption doesn't hold is that the model is quite 
a poor fit. If you take a look at gam.check(fit) you'll see that the 
constant variance assumption of quasi(link=log) is violated quite badly, 
and the residual distribution is really quite odd (plot residuals 
against fitted as well). Also see plot(fit,pages=1,scale=0) - it shows 
ballooning confidence intervals and smooth estimates that are so low in 
places that they might as well be minus infinity (given log link) - 
clearly something is wrong with this model! 



Following Simon's advice (quote):
try Tweedie(p=1.5,link=log) as the family. Also the predictor 
variables are very skewed which is giving leverage problems, so I would 
transform them to give less skew. e.g. Something like 

 fit <- gam(target~s(log(mgs))+s(I(gsd^.5))+s(I(mud^.25))+s(log(ssCmax)),
            family=Tweedie(p=1.6,link=log), data=wspe1, method="REML")
  summary(fit)

 Parametric coefficients:
             Estimate Std. Error t value Pr(>|t|)
 (Intercept)  4.02654    0.05231   76.97   <2e-16 ***
 Approximate significance of smooth terms:
                   edf Ref.df     F p-value
 s(log(mgs))     6.067  7.292 12.58  <2e-16 ***
 s(I(gsd^0.5))   4.009  5.138 18.25  <2e-16 ***
 s(I(mud^0.25))  7.210  8.240 58.54  <2e-16 ***
 s(log(ssCmax))  8.407  8.764 74.87  <2e-16 ***
 R-sq.(adj) =  0.303   Deviance explained =   51%
 REML score =  14355  Scale est. = 27.702    n = 4511

(I slightly shortened the output)

RMSE did not improve:
  sqrt(mean(residuals.gam(fit,type="response")^2))
 [1] 1009.268
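
For readers who want to check what this RMSE measures: with type = "response" the residuals are simply observed minus fitted values on the original scale. The sketch below illustrates that with a base-R glm() on simulated data, as a stand-in for the gam() fits above (the wspe1 data are not available here, so x, y and the Poisson family are assumptions made only for illustration).

```r
set.seed(42)

# Simulated count data with a log-linear mean (illustrative only)
x <- runif(200)
y <- rpois(200, lambda = exp(1 + 2 * x))

fit <- glm(y ~ x, family = poisson(link = "log"))

# RMSE from response-scale residuals, as in the calls above
rmse_resid <- sqrt(mean(residuals(fit, type = "response")^2))

# The same quantity computed by hand from observed and fitted values
rmse_manual <- sqrt(mean((y - fitted(fit))^2))

all.equal(rmse_resid, rmse_manual)
```

Because this RMSE is computed on the response scale, it need not improve when a model with a different variance assumption fits better by its own criterion.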

The diagnostic plots follow:

http://r.789695.n4.nabble.com/file/n4665370/screen-capture-3.png 

http://r.789695.n4.nabble.com/file/n4665370/screen-capture-4.png 

which look much better. 
The QQ-plot is closer to the identity line, 
the residuals are more evenly spread and much smaller.
Still, the correlation between response and fitted values seems pretty low.

Hope this helps,

Jan






--
View this message in context: 
http://r.789695.n4.nabble.com/mgcv-how-select-significant-predictor-vars-when-using-gam-select-TRUE-using-automatic-optimization-tp4664510p4665370.html
Sent from the R help mailing list archive at Nabble.com.



[R] How are R version types named ? Any convention (like Hurricanes etc)

2013-04-25 Thread Ajay Ohri
With reference to R News

News:

R version 3.0.0 (Masked Marvel) has been released on 2013-04-03.
R version 2.15.3 (Security Blanket) has been released on 2013-03-01
R version 2.15.2 (Trick or Treat) 
R version 2.15.1 (Roasted Marshmallows) ...
R version 2.15.0 (Easter Beagle)
R version 2.14.0 (Great Pumpkin)

Dear R help List,


How are these versions named? Masked Marvel comes after Security
Blanket, which comes after Trick or Treat, which comes after Roasted
Marshmallows. Is there some convention, like the one for hurricanes in the West?

It is totally incomprehensible to me as I am in India.


Sincerely,

Ajay Ohri

Author-
R for Business Analytics
http://www.amazon.com/R-Business-Analytics-A-Ohri/dp/1461443423
Founder-
Decisionstats.com
http://decisionstats.com



Re: [R] Missing data

2013-04-25 Thread Rui Barradas

Hello,

Something like this?

x <- scan(text = "
125 130.3 327.2 252.2 33.8 6.1 5.1 0.5 0.5 0
2.3 0 0 0 0 0 0 0 0 0
0.8 5.1 0 0.3 0 0 0 0 0 0
45.7 43.4 0 0 0 0 0
")

putMissing <- function(x, by){
    idx <- by*seq_along(x)
    idx <- idx[which(idx <= length(x))]
    x[idx] <- NA
    x
}

putMissing(x, 10)
putMissing(x, 5)


Hope this helps,

Rui Barradas



Re: [R] Decomposing a List

2013-04-25 Thread Ted Harding
Thanks, Jorge, that seems to work beautifully!
(Now to try to understand why ... but that's for later).
Ted.
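
On the "why": in R the subsetting operator is itself a function, so sapply() can apply it to every list element with the index passed as an extra argument. A minimal sketch, assuming the list came from splitting comma-separated strings (the input strings here are made up):

```r
# Hypothetical input; strsplit() returns a list of character vectors
L <- strsplit(c("A1,B1", "A2,B2", "A3,B3"), ",")

# `[` is an ordinary function: `[`(v, 1) is the same as v[1]
V1 <- sapply(L, `[`, 1)
V2 <- sapply(L, `[`, 2)

# Equivalent, spelled out with an anonymous function
V1_explicit <- sapply(L, function(v) v[1])

# vapply() does the same but insists that each result is a single string
V2_checked <- vapply(L, `[`, character(1), 2)
```

Writing "[" in double quotes, as in Jorge's call, or in backticks both refer to the same function.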

On 25-Apr-2013 10:21:29 Jorge I Velez wrote:
 Dear Dr. Harding,
 
 Try
 
 sapply(L, "[", 1)
 sapply(L, "[", 2)
 
 HTH,
 Jorge.-
 
 
 
 On Thu, Apr 25, 2013 at 8:16 PM, Ted Harding ted.hard...@wlandres.net wrote:
 
 Greetings!
 For some reason I am not managing to work out how to do this
 (in principle) simple task!

 As a result of applying strsplit() to a vector of character strings,
 I have a long list L (N elements), where each element is a vector
 of two character strings, like:

   L[1] = c("A1","B1")
   L[2] = c("A2","B2")
   L[3] = c("A3","B3")
   [etc.]

 From L, I wish to obtain (as directly as possible, e.g. avoiding
 a loop) two vectors each of length N where one contains the strings
 that are first in the pair, and the other contains the strings
 which are second, i.e. from L (as above) I would want to extract:

   V1 = c("A1","A2","A3",...)
   V2 = c("B1","B2","B3",...)

 Suggestions?

 With thanks,
 Ted.

 -
 E-Mail: (Ted Harding) ted.hard...@wlandres.net
 Date: 25-Apr-2013  Time: 11:16:46
 This message was sent by XFMail



-
E-Mail: (Ted Harding) ted.hard...@wlandres.net
Date: 25-Apr-2013  Time: 11:31:57
This message was sent by XFMail



Re: [R] Stochastic Frontier: Finding the optimal scale/scale efficiency by frontier package

2013-04-25 Thread Arne Henningsen
Dear Miao

On 25 April 2013 03:26, jpm miao miao...@gmail.com wrote:
 I am trying to find out the scale efficiency and optimal scale of banks
 by stochastic frontier analysis given the panel data of bank. I am free to
 choose any model of stochastic frontier analysis.

 The only approach I know to work with R is to estimate a translog
 production function by sfa or other related function in frontier package,
 and then use the Ray 1998 formula to find the scale efficiency. However, as
 the textbook Coelli et al 2005 point out that the concavity may not be
 satisfied,  one needs to impose the nonpositive definiteness condition so
 that the scale efficiency 1.

It might be that the true technology is not concave and that the
elasticity of scale is larger than one. Indeed, most empirical studies
find increasing returns to scale (in many different sectors).
Therefore, it is probably inappropriate to impose concavity.

 How can I do it with frontier package?

The frontier package cannot impose concavity on a Translog production
function and I am not aware of any software that can do this in a
stochastic frontier estimation -- probably, because imposing concavity
usually does not make sense.

 Is there any other SFA model/function in R recommended to find out the
 scale efficiency and optimal scale?

I suggest plotting the elasticity of scale against the firm size. If
the elasticity of scale decreases with firm size, then the most
productive firm size is the size at which the elasticity of
scale is one. However, there are some problems with using the Translog
production function (and the Translog distance function) for
determining the optimal firm size [1].

[1] http://econpapers.repec.org/RePEc:foi:wpaper:2012_12
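
As a rough illustration of that plot: for a Translog function the elasticity of scale is the sum of the output elasticities of all inputs, each of which is linear in the log inputs. The sketch below uses simulated data and an ordinary lm() fit as a stand-in for a frontier estimate; the two inputs, all coefficient values and the use of log output as the size measure are assumptions made only for illustration.

```r
set.seed(123)
n  <- 200

# Simulated log inputs and a Translog technology (illustrative values)
l1 <- rnorm(n, mean = 3, sd = 0.8)
l2 <- rnorm(n, mean = 2, sd = 0.8)
q11 <- 0.5 * l1^2
q22 <- 0.5 * l2^2
q12 <- l1 * l2
ly <- 0.5 + 0.7 * l1 + 0.6 * l2 - 0.05 * q11 - 0.04 * q22 - 0.02 * q12 +
  rnorm(n, sd = 0.1)

# Stand-in for the frontier estimation: plain least squares
fit <- lm(ly ~ l1 + l2 + q11 + q22 + q12)
b <- coef(fit)

# Output elasticities of the two inputs at each observation
e1 <- b[["l1"]] + b[["q11"]] * l1 + b[["q12"]] * l2
e2 <- b[["l2"]] + b[["q22"]] * l2 + b[["q12"]] * l1

# Elasticity of scale = sum of the output elasticities
scale_elas <- e1 + e2

# Elasticity of scale against firm size (here: log output)
plot(ly, scale_elas, xlab = "log output", ylab = "elasticity of scale")
abline(h = 1, lty = 2)
```

With a model estimated by the frontier package, the coefficients would be taken from that fit instead, and the same two formulas would apply.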

If you have further questions regarding the frontier package, I
suggest that you use the help forum at frontier's R-Forge site [2].

[2] https://r-forge.r-project.org/projects/frontier/

... and please do not forget to cite the R packages that you use in
your analysis in your publications. Thanks!

Best wishes,
Arne

--
Arne Henningsen
http://www.arne-henningsen.name



Re: [R] Assigning a variable value based on multiple columns

2013-04-25 Thread Jason Stout, M.D.
Thanks Patrick--I think this solution will work perfectly.

Jason

Jason Stout, MD, MHS
Box 102359-DUMC
Durham, NC 27710
FAX 919-681-7494


From: Patrick Coulombe [patrick.coulo...@gmail.com]
Sent: Thursday, April 25, 2013 1:53 AM
To: Jason Stout, M.D.
Cc: r-help@r-project.org
Subject: Re: [R] Assigning a variable value based on multiple columns

Hi Jason,

I think that the easiest for you would be to keep your current ifelse
statements as they are, but change your NA into something else (e.g., -999,
or anything else). To do this in one line, you can use the package
gdata.

In this code, I assume that your data are stored in the variable dataset:


###
#install package gdata if not yet installed
install.packages("gdata")

#load package gdata
library(gdata)

#change NA into -999
dataset <- NAToUnknown(dataset, -999)


#do your ifs/ifelses here...
#...
#...


#change -999 back into NA
dataset <- unknownToNA(dataset, -999)



And that should do it.

Hope this helps,
Patrick
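
A possible base-R alternative that avoids recoding NA altogether: NA %in% 1 is FALSE, so testing each column with %in% treats missing values as "not 1". This is only a sketch built from the five example rows in the question; the helper name is1 is made up for illustration.

```r
# The example rows from the question
dataset <- data.frame(
  ID       = 1:5,
  Diabetes = c(0, 1, NA, 0, 1),
  ESRD     = c(0, 0, 1, NA, 1),
  HIV      = c(NA, NA, 0, 0, 1),
  Contact  = c(0, 0, 0, 1, 0)
)

# Hypothetical helper: TRUE where the value is 1, FALSE for 0 and for NA
is1 <- function(v) v %in% 1

# Lowest applicable cutoff: 5 beats 10, which beats the default of 15
dataset$TSTcutoff <- ifelse(is1(dataset$HIV) | is1(dataset$Contact), 5,
                     ifelse(is1(dataset$Diabetes) | is1(dataset$ESRD), 10, 15))

dataset
```

The nested ifelse() checks the most restrictive condition first, so a row that qualifies for both 5 and 10 gets 5.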


2013/4/24 Jason Stout, M.D. jason.st...@duke.edu

 Hi All,

 I'm hoping someone can help me with a relatively simple problem.  Take the 
 following dataset:

 ID  Diabetes  ESRD  HIV  Contact
 1   0         0     NA   0
 2   1         0     NA   0
 3   NA        1     0    0
 4   0         NA    0    1
 5   1         1     1    0

 I want to generate a column called TSTcutoff based on the values in the row.  
 TSTcutoff would be the lower of 15 (if Diabetes=ESRD=HIV=Contact=0), 10 (if 
 Diabetes or ESRD=1 AND HIV=Contact=0), or 5 (if HIV OR Contact=1).  I was 
 thinking this could be done with a series of IFELSE statements, but the NA 
 values make this more challenging.  I want to ignore NA values when 
 calculating TSTcutoff.  So the final dataset should look like this:

 ID  Diabetes  ESRD  HIV  Contact  TSTcutoff
 1   0         0     NA   0        15
 2   1         0     NA   0        10
 3   NA        1     0    0        10
 4   0         NA    0    1        5
 5   1         1     1    0        5

 Thanks for any suggestions.

 Jason Stout, MD, MHS
 Box 102359-DUMC
 Durham, NC 27710
 FAX 919-681-7494



Re: [R] How are R version types named ? Any convention (like Hurricanes etc)

2013-04-25 Thread Jim Lemon

On 04/25/2013 07:46 PM, Ajay Ohri wrote:

With reference to R News

News:

R version 3.0.0 (Masked Marvel) has been released on 2013-04-03.
R version 2.15.3 (Security Blanket) has been released on 2013-03-01
R version 2.15.2 (Trick or Treat) 
R version 2.15.1 (Roasted Marshmallows) ...
R version 2.15.0 (Easter Beagle)
R version 2.14.0 (Great Pumpkin)

Dear R help List,


How are these version types named? Masked Marvel comes after Security
Blanket comes after Trick or Treat comes after Roasted Marshmallows.
Is it some convention like that for Hurricanes in the West.

It is totally incomprehensible to me as I am in India.


Hi Ajay,
My guess is that these correspond roughly to the levels of enlightenment 
that can be attained by mortals. We begin with the being having no 
concept of enlightenment. A pumpkin, however great, is as Freud might 
have said, only a pumpkin. A beagle, despite its lowly concept of 
nirvana, which it considers to be a place full of bones and interesting 
things to sniff, has begun to climb that long, long ladder. At first we 
might think of a marshmallow as a step backward on the road to supernal 
knowledge, but if we consider it as the plight of a being impaled upon a 
black birch twig, its essence floating upward with the smoke from the 
campfire, it is easy to see that this spasm of suffering is the prelude 
to its own spiritual ascent. Trick or treat signifies the problem of the 
initiate. So many ways are open, so many promises made. Which way leads 
to the goal? The seeker may cast about for a Security Blanket, some 
apparently firm basis upon which to regain one's bearings. The Masked 
Marvel is the true way, always concealed from all but those who have 
cast off the earthly delights of black box statistical packages and 
devoted their lives to the study of R. In future versions, we will no 
doubt see further occult signs that will lead us in the right direction 
if only we remain true to our noble and transcendent mission.


Okay, I think they are mostly from comic book characters.

Jim



Re: [R] How are R version types named ? Any convention (like Hurricanes etc)

2013-04-25 Thread John Kane
I am not sure, but it looks suspiciously like a set of references to the 
comic strip Peanuts by Charles Schulz. http://en.wikipedia.org/wiki/Peanuts


John Kane
Kingston ON Canada




Re: [R] Regression and FMMs with flexmix

2013-04-25 Thread Ingmar Visser
Robin,

On Wed, Apr 24, 2013 at 11:24 AM, Robin Tviet robintv...@outlook.com wrote:


 I am trying to understand how to use the flexmix package, I have read the
 Leisch paper but am very unclear what is needed for the M-step driver.  I
 am just fitting a simple linear regression model.  The documentation is far
 from clear what the FLXmclust function does, but, it in principle could do
 all I need; however, I do not get sensible results. If I try the
 following, the result is poor:

 x <- c()
 for(i in 0:99){x$y[2*i]=(0+i);x$x[2*i]=i;x$x[2*i+1]=i;x$y[2*i+1]=i+1000;x$g[2*i]=1;x$g[2*i+1]=2}
 m1 <- flexmix(y~x  ,data=x,k=2)
 table(x$g,m1@cluster)

 1  2
 1 25 74
 2 67 33


There is no correlation between x and y, nor within groups, nor between
groups, so I am not sure why your model would make sense; the following model
runs just fine (although it also depends on the starting values whether the
result is the 2 expected clusters or 1 large cluster of all the data):

 set.seed(1)
 m1 <- flexmix(y~1  ,data=x,k=2)
 m1

Call:
flexmix(formula = y ~ 1, data = x, k = 2)

Cluster sizes:
  1   2
 99 100

convergence after 2 iterations

hth, Ingmar




 It all depends on the randomised starting values.  So I think I need a
 better driver, but, I cannot find a spec for what I have to do in the
 driver.

 Where is FLXmclust documented?  can anyone assist?





[R] Selecting and then joining data blocks

2013-04-25 Thread Preetam Pal
Hi all,

I have 4 matrices, each having 5 columns and 4 rows, denoted by
B1, B2, B3, B4.
I have generated a vector of 7 indices, say (1,2,4,3,2,3,1), which refers to
the index of the matrices to be chosen and then stacked one on top of
the next: in this case, I wish to have the following mega matrix:
B1 over B2 over B4 over B3 over B2 over B3 over B1.

1. How can I achieve this?
2. I don't want to manually identify and arrange the matrices for each
vector of index values generated (the code I used for this is:
index=sample(4,7,replace=T)). How can I automate the process?

Basically, I am doing bootstrapping, but the observations are actually 4x5
matrices.

Appreciate your help.


Thanks,
Preetam


---

Preetam Pal
(+91)-9432212774
M-Stat 2nd Year, Room No. N-114
Statistics Division,   C.V.Raman
Hall
Indian Statistical Institute, B.H.O.S.
Kolkata.



Re: [R] Selecting and then joining data blocks

2013-04-25 Thread arun
Hi,
set.seed(24)
#creating the four matrices in a list

lst1 <- lapply(1:4,function(x) matrix(sample(1:40,20,replace=TRUE),ncol=5))
names(lst1) <- paste0("B",1:4)
vec <- c(1,2,4,3,2,3,1)
res <- do.call(rbind,lapply(vec,function(i) lst1[[i]]))
dim(res)
#[1] 28  5


#or
B1 <- lst1[[1]]
B2 <- lst1[[2]]
B3 <- lst1[[3]]
B4 <- lst1[[4]]

res2 <- do.call(rbind,lapply(vec,function(i) get(paste0("B",i))))
identical(res,res2)
#[1] TRUE
A.K.
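
Since a list can be indexed by a vector that contains repeats, the stacking step can probably be shortened further. The sketch below uses made-up blocks (each matrix filled with its own index, so the stacking order is easy to see); in a bootstrap the index vector would come from sample() instead.

```r
# Four illustrative 4x5 blocks; block k is filled with the value k
blocks <- lapply(1:4, function(k) matrix(k, nrow = 4, ncol = 5))

# Index vector from the question; for a bootstrap draw one could use
# index <- sample(4, 7, replace = TRUE)
index <- c(1, 2, 4, 3, 2, 3, 1)

# blocks[index] repeats and reorders the list; rbind stacks the result
mega <- do.call(rbind, blocks[index])

dim(mega)
mega[, 1]
```

Keeping the matrices in a list from the start means no get() or paste0() lookup of variable names is needed.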






Re: [R] Decomposing a List

2013-04-25 Thread arun
Hi,
Maybe this helps.

L <- list(c("A1","B1"),c("A2","B2"),c("A3","B3"))
simplify2array(L)[1,]
#[1] "A1" "A2" "A3"
simplify2array(L)[2,]
#[1] "B1" "B2" "B3"


#or
library(stringr)
word(sapply(L,paste,collapse=" "),1)
#[1] "A1" "A2" "A3"
A.K.
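
Another base-R route, sketched under the assumption that every element of L has exactly two strings: bind the elements as rows of a matrix and read off its columns.

```r
L <- list(c("A1", "B1"), c("A2", "B2"), c("A3", "B3"))

# Each list element becomes one row of an N x 2 character matrix
M <- do.call(rbind, L)

V1 <- M[, 1]   # first string of every pair
V2 <- M[, 2]   # second string of every pair
```

If some elements had a different length, rbind() would recycle values with a warning, so this is best used only when the split is known to give pairs.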





Re: [R] installing package

2013-04-25 Thread Martin Morgan

On 04/25/2013 12:19 AM, Gitte Brinch Andersen wrote:

Hi

I am trying to install a package (bioconductor) but every time I try to install 
it I get this message:

source("http://bioconductor.org/biocLite.R")
Warning in install.packages("BiocInstaller", repos = a["BioCsoft", "URL"]) :
   'lib = "C:/Program Files/R/R-3.0.0/library"' is not writable
Error in install.packages("BiocInstaller", repos = a["BioCsoft", "URL"]) :
   unable to install packages


Hi Gitte -- this and your Mac path problems are really a question for the 
Bioconductor mailing list


  http://bioconductor.org/help/mailing-list/

I don't know the answer to your path problem, but the package author monitors 
that list and will be able to help.


I would have expected the attempt to run the biocLite.R script to result in a 
dialog that asks 'Would you like to use a personal library instead?', to which 
you should answer 'yes'.


If for some reason you do not want to answer 'yes', then read the help page 
?.libPaths
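
For reference, setting up a personal library by hand might look roughly like the sketch below. The helper name use_personal_library is made up for illustration; R_LIBS_USER is the environment variable R uses for the default personal library location, and whether it is set on a given machine is an assumption.

```r
# Hypothetical helper: create `lib` if needed and put it first on the search
# path, so that install.packages() installs there by default
use_personal_library <- function(lib = Sys.getenv("R_LIBS_USER")) {
  if (!dir.exists(lib)) {
    dir.create(lib, recursive = TRUE)
  }
  .libPaths(c(lib, .libPaths()))
  invisible(.libPaths()[1])
}

# Usage (not run here):
# use_personal_library()
# .libPaths()   # the personal library should now be listed first
```

Because the personal library lives in the user's own folder, no write access to the R installation directory is needed.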


Hope that helps, and please ask your questions about Bioconductor packages on 
the Bioconductor mailing list.


Martin



I normally use mac computers, but I cannot get the right path for the folders I 
should use, so now I am trying with a windows platform instead. But now I 
cannot install one of the packages my pipeline needs.

Can anyone help?

I know it is probably a simple problem, but I have never used R before and 
don't know how to solve problems in it.

Best

Gitte Andersen

E-mail: gitt...@hum-gen.au.dk





--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793



[R] Make R 3.0 open .RData files

2013-04-25 Thread Dimitri Liakhovitski
Hello!

I have Windows 7 Enterprise and two versions of R installed: 2.15.3 and
3.0.0.
Before I had R 3.0 I made it a setting that all .RData files - when I
double-click on them - were opened by R 2.15.3.
Now I want them to be opened by R 3.0 instead of R 2.15.3 (but I don't want
to remove R 2.15.3. yet).

I right-click on some .RData file, select "Open with" -> "Choose default
program" and then click on "Browse".

I browse to the folder where my R 3.0 is installed, then to the folder
bin, then to the folder x64 and select Rgui.exe.
However, when R opens - or after I shut R down and then double-click on
some .RData file and R opens, it is again R 2.15.3, not R3.0.

What am I doing wrong?

Of course, when I open R 3.0 directly, then it opens no problem.

Thank you!

-- 
Dimitri Liakhovitski



Re: [R] Linear Interpolation : Missing rates

2013-04-25 Thread Adams, Jean
Katherine,

Split the rate names into their currency and tenor parts and assign a
numeric value to each tenor.  Choose a model to do your approximations (I
used linear regression in the example below).  Use this model to generate
estimates for all combinations of currency and tenor.

For example:

# split the rate names into currency and tenor
# (as.character() in case rate_name was read in as a factor)
splitnames <- do.call(rbind, strsplit(as.character(df$rate_name), "_"))
df$currency <- as.factor(splitnames[, 1])
df$tenor <- splitnames[, 2]

# assign numeric value to each tenor
uniquetenors <- c("1w", "2w", "1m", "2m")
uniquedays <- c(7, 14, 30.5, 61)
df$tenordays <- uniquedays[match(df$tenor, uniquetenors)]

# fit a linear model of rate on tenordays for each currency
fit <- lm(rates ~ currency*tenordays, data=df)

# estimate rates for all combinations of currency and tenor
fulldf <- expand.grid(tenordays=unique(df$tenordays),
                      currency=unique(df$currency))
fulldf$est.rates <- predict(fit, newdata=fulldf)

# merge observed rates with estimated rates
dfwithest <- merge(df, fulldf, all=TRUE)

Jean
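
Since the question mentions approx, here is a sketch of a piecewise-linear alternative to the regression above. It starts from the per-tenor mean rates of the posted data and reuses the day counts assumed above; the names obs, tenor_days and interp_one are made up for illustration, and rule = 2 (carry the end values outward) is an assumption about how tenors beyond the observed range should be handled.

```r
# Per-tenor mean rates for the tenors that were observed in the question
obs <- data.frame(
  currency = rep(c("USD", "GBP", "EURO"), each = 3),
  tenor    = c("1w", "1m", "2m",  "1w", "1m", "2m",  "1w", "2w", "2m"),
  rate     = c(2.06, 2.23, 2.32,  1.075, 1.215, 1.39,  1.8125, 1.975, 2.0975),
  stringsAsFactors = FALSE
)

# Assumed day counts per tenor
tenor_days <- c("1w" = 7, "2w" = 14, "1m" = 30.5, "2m" = 61)

# Interpolate one currency's curve linearly onto the full tenor grid
interp_one <- function(d) {
  est <- approx(x = tenor_days[d$tenor], y = d$rate,
                xout = tenor_days, rule = 2)$y
  data.frame(currency = d$currency[1], tenor = names(tenor_days),
             est.rate = est, stringsAsFactors = FALSE)
}

full <- do.call(rbind, lapply(split(obs, obs$currency), interp_one))
full
```

Observed tenors keep their own rate and only the gaps are filled, so it does not matter in advance which tenors are missing for which currency.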


On Thu, Apr 25, 2013 at 12:33 AM, Katherine Gobin katherine_go...@yahoo.com
 wrote:

 Dear R forum

 I have data.frame as

 df = data.frame(rate_name = c("USD_1w", "USD_1w", "USD_1w", "USD_1w",
 "USD_1m", "USD_1m", "USD_1m", "USD_1m", "USD_2m", "USD_2m", "USD_2m",
 "USD_2m", "GBP_1w", "GBP_1w", "GBP_1w", "GBP_1w", "GBP_1m", "GBP_1m",
 "GBP_1m", "GBP_1m", "GBP_2m", "GBP_2m", "GBP_2m", "GBP_2m", "EURO_1w",
 "EURO_1w", "EURO_1w", "EURO_1w", "EURO_2w", "EURO_2w", "EURO_2w",
 "EURO_2w", "EURO_2m", "EURO_2m", "EURO_2m", "EURO_2m"), rates = c(2.05,
 2.07, 2.06, 2.06, 2.22, 2.24, 2.23, 2.23, 2.31, 2.33, 2.33, 2.31, 1.06,
 1.08, 1.08, 1.08, 1.21, 1.21, 1.23, 1.21, 1.41, 1.39, 1.39, 1.37, 1.82,
 1.82, 1.81, 1.80, 1.98, 1.98, 1.97, 1.97, 2.1, 2.09, 2.09, 2.11))

 currency = c("EURO", "GBP", "USD")
 tenor = c("1w", "2w", "1m", "2m", "3m")

 # _

 df
    rate_name rates
 1     USD_1w  2.05
 2     USD_1w  2.07
 3     USD_1w  2.06
 4     USD_1w  2.06
 5     USD_1m  2.22
 6     USD_1m  2.24
 7     USD_1m  2.23
 8     USD_1m  2.23
 9     USD_2m  2.31
 10    USD_2m  2.33
 11    USD_2m  2.33
 12    USD_2m  2.31
 13    GBP_1w  1.06
 14    GBP_1w  1.08
 15    GBP_1w  1.08
 16    GBP_1w  1.08
 17    GBP_1m  1.21
 18    GBP_1m  1.21
 19    GBP_1m  1.23
 20    GBP_1m  1.21
 21    GBP_2m  1.41
 22    GBP_2m  1.39
 23    GBP_2m  1.39
 24    GBP_2m  1.37
 25   EURO_1w  1.82
 26   EURO_1w  1.82
 27   EURO_1w  1.81
 28   EURO_1w  1.80
 29   EURO_2w  1.98
 30   EURO_2w  1.98
 31   EURO_2w  1.97
 32   EURO_2w  1.97
 33   EURO_2m  2.10
 34   EURO_2m  2.09
 35   EURO_2m  2.09
 36   EURO_2m  2.11

 As can be seen, USD_2w, GBP_2w and EURO_1m are missing and I need to
 INTERPOLATE these rates, which can be done using approx or approxfun. In
 reality I can have many currencies with many tenors. The problem is that when
 the data.frame df is read or accessed in R, I do not know which tenor is
 missing. For a given currency, it is possible that more than 1 consecutive
 tenor may be missing, e.g. in the case of EURO, I may have EURO_1w, EURO_2w
 and then EURO_4m, so EURO_1m, EURO_2m and EURO_3m are missing.


 I understand it's sort of vague question from me and do apologize for the
 same. Any suggestion please.

 Regards

 Katherine







Re: [R] Bootstrapping in R

2013-04-25 Thread David Carlson
First you should read some introductory manuals on R. There are many to
choose from at

http://cran.r-project.org/other-docs.html

For example, your first question is very simple:

z <- data.frame(a, b, c)

To draw a single random sample (with replacement) from z:

z1 <- z[sample(1:nrow(z), nrow(z), replace=TRUE),]
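
To go from one resample to a full bootstrap, the same row sampling can be repeated many times with replicate(). The sketch below uses simulated vectors in place of the real a, b and c, and the column means as an example statistic; both choices, and the number of resamples B, are assumptions for illustration.

```r
set.seed(1)

# Stand-ins for the three vectors of length 25
a <- rnorm(25)
b <- rnorm(25)
c <- rnorm(25)
z <- data.frame(a, b, c)

B <- 200  # number of bootstrap resamples (illustrative)

# Each resample draws whole rows with replacement, then computes the statistic
boot_means <- replicate(B, {
  idx <- sample(nrow(z), replace = TRUE)
  colMeans(z[idx, ])
})

# Bootstrap standard errors of the three column means
boot_se <- apply(boot_means, 1, sd)
boot_se
```

Sampling row indices keeps a[i], b[i] and c[i] together, which is the point of putting the vectors into one data frame first.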

-
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77840-4352

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Michael Weylandt
Sent: Thursday, April 25, 2013 4:36 AM
To: Preetam Pal
Cc: r-help@r-project.org
Subject: Re: [R] Bootstrapping in R



On Apr 25, 2013, at 7:02, Preetam Pal lordpree...@gmail.com wrote:

 Hi all,
 
 1) i have 3 vectors a, b and c, each of length 25... i want to define a
 new data frame z such that z[1] = (a[1] b[1] c[1]), z[2] = (a[2] b[2]
 c[2]) and so on... how do i do it in R
 

z <- data.frame(a, b, c)


 
 2) Then i want to draw bootstrap samples from z.

Look at the boot package. 

MW

 
 Kindly suggest how i can do this in R.
 
 Thanks,
 Preetam
 --
 Preetam Pal
 (+91)-9432212774
 M-Stat 2nd Year, Room No.
N-114
 Statistics Division,   C.V.Raman
 Hall
 Indian Statistical Institute, B.H.O.S.
 Kolkata.
 


Re: [R] Missing data

2013-04-25 Thread David Carlson
Another approach:

x[1:length(x) %% 10 == 0] <- NA

Just replace 10 by the interval you want. Or to add 5 missing values
randomly:

x[sample(1:length(x), 5)] <- NA
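
To connect this with the stated goal (remove values, fill them in, compare with the originals), here is a rough sketch. It assumes linear interpolation with approx() as the fill-in method, which is only one of many possibilities; the helper name knock_out_every and the RMSE comparison are likewise illustrative choices, and the vector holds just the first 30 rainfall values from the question.

```r
# First 30 values of the rainfall series from the question.
rain <- c(125, 130.3, 327.2, 252.2, 33.8, 6.1, 5.1, 0.5, 0.5, 0,
          2.3, 0, 0, 0, 0, 0, 0, 0, 0, 0,
          0.8, 5.1, 0, 0.3, 0, 0, 0, 0, 0, 0)

# Hypothetical helper: set every k-th observation to NA.
knock_out_every <- function(x, k) {
  x[seq_along(x) %% k == 0] <- NA
  x
}

with_gaps <- knock_out_every(rain, 10)
gap <- is.na(with_gaps)

# Fill the gaps by linear interpolation over the observation index.
# rule = 2 carries the nearest observed value to a gap at either end.
idx <- seq_along(with_gaps)
filled <- approx(idx[!gap], with_gaps[!gap], xout = idx, rule = 2)$y

# Compare the filled-in values with the ones that were removed.
rmse <- sqrt(mean((filled[gap] - rain[gap])^2))
print(data.frame(index = idx[gap], original = rain[gap], filled = filled[gap]))
print(rmse)
```

Swapping approx() for another imputation method only changes the line that computes filled; the knock-out and comparison steps stay the same.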

-
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77840-4352

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Rainer Schuermann
Sent: Thursday, April 25, 2013 5:45 AM
To: r-help@r-project.org
Cc: Roslina Zakaria
Subject: Re: [R] Missing data

I read your data into a dataframe

x <- read.table( "clipboard" )

and renamed the only column

colnames( x )[1] <- "orig"

With a loop, I created a 2nd column "miss" where in every 10th row the
observation is set to NA:

for( i in 1 : length( x$orig ) )
{
  if( as.integer( rownames( x )[ i ] ) %% 10 == 0 )
  {
    x$miss[i] <- NA
  } else {
    x$miss[i] <- x$orig[i]
  }
}

This is probably the least elegant of all possible solutions but it works...

Rgds,
Rainer




On Wednesday 24 April 2013 23:41:21 Roslina Zakaria wrote:
 Dear r-users,
 
 I would like to investigate about how to fill in missing data.  I started
with a complete data and try to introduce missing data into the data series.
Then I would use some method to fill in the missing data and then compare
with the original data how good it is.  My question is, how do I introduce
missing data in my complete data systematically like for example every 10th
data will be erased and assumed as missing.  Here are some rainfall data:
 
 125
 130.3
 327.2
 252.2
 33.8
 6.1
 5.1
 0.5
 0.5
 0
 2.3
 0
 0
 0
 0
 0
 0
 0
 0
 0
 0.8
 5.1
 0
 0.3
 0
 0
 0
 0
 0
 0
 45.7
 43.4
 0
 0
 0
 0
 0
 
 Thank you so much for any help given.  I hope my question is clear.




Re: [R] Make R 3.0 open .RData files

2013-04-25 Thread Duncan Murdoch

On 13-04-25 8:33 AM, Dimitri Liakhovitski wrote:

Hello!

I have Windows 7 Enterprise and two versions of R installed: 2.15.3 and
3.0.0.
Before I had R 3.0 I made it a setting that all .RData files - when I
double-click on them - were opened by R 2.15.3.
Now I want them to be opened by R 3.0 instead of R 2.15.3 (but I don't want
to remove R 2.15.3. yet).

I right-click on some .RData file, select Open with -> Choose default
program and then click on Browse.

I browse to the folder where my R 3.0 is installed, then to the folder
bin, then to the folder x64 and select Rgui.exe.
However, when R opens - or after I shut R down and then double-click on
some .RData file and R opens, it is again R 2.15.3, not R3.0.

What am I doing wrong?

Of course, when I open R 3.0 directly, then it opens no problem.


This is really a question about Windows 7, not about R, but I would 
guess you aren't telling it to make your choice permanent, or perhaps 
you are not allowed by your administrator to make permanent changes to 
file associations.  You should ask for local help.


Duncan Murdoch



Re: [R] Trouble Computing Type III SS in a Cox Regression

2013-04-25 Thread Paul Miller
Hi Dr. Therneau,

Thanks for your reply to my question. I'm aware that many on the list do not 
like type III SS. I'm not particularly attached to the idea of using them but 
often produce output for others who see value in type III SS. 

You mention the problems with type III SS when testing interactions. I don't 
think we'll be doing that here though. So my type III SS could just as easily 
be called type II SS I think. If the SS I'm calculating are essentially type II 
SS, is that still problematic for a Cox model?

People using type III SS generally want a measure of whether or not a variable 
is contributing something to their model or if it could just as easily be 
discarded. Is there a better way of addressing this question than by using type 
III (or perhaps type II) SS?

A series of model comparisons using a LRT might be the answer. If it is, is 
there an efficient way of implementing this approach when there are many 
predictors? Another approach might be to run models through step or stepAIC in 
order to determine which predictors are useful and to discard the rest. Is that 
likely to be any good?

Thanks,

Paul

--- On Wed, 4/24/13, Terry Therneau thern...@mayo.edu wrote:

 From: Terry Therneau thern...@mayo.edu
 Subject: Re:  Trouble Computing Type III SS in a Cox Regression
 To: r-help@r-project.org, Paul Miller pjmiller...@yahoo.com
 Received: Wednesday, April 24, 2013, 5:55 PM
 I should hope that there is trouble,
 since type III is an undefined concept for a Cox
 model.  Since SAS Inc fostered the cult of type III
 they have recently added it as an option for phreg, but I am
 not able to find any hints in the phreg documentation of
 what exactly they are doing when you invoke it.  If you
 can unearth this information, then I will be happy to tell
 you whether
    a. using the test (whatever it is) makes
 any sense at all for your data set
    b. if a is true, how to get it out of R
 
 I use the word "cult" on purpose -- an entire generation of users who
 believe in the efficacy of this incantation without having any idea what
 it actually does.  In many particular instances the SAS type III
 corresponds to a survey sampling question, i.e., reweight the data so
 that it is balanced wrt factor A and then test factor B in the new
 sample.  The three biggest problems with type III are that
 1: the particular test has been hyped as "better" when in fact it
 sometimes is sensible and sometimes not,
 2: SAS implemented it as a computational algorithm which unfortunately
 often works even when the underlying rationale does not hold, and
 3: they explain it using a notation that completely obscures the actual
 question.  This last leads to the nonsense phrase "test for main effects
 in the presence of interactions".
 
 There is a survey reweighted approach for Cox models, very closely
 related to the work on causal inference (marginal structural models),
 but I'd bet dollars to donuts that this is not what SAS is doing.
 
 (Per 2 -- type III was a particular order of operations of the sweep
 algorithm for linear models, and for backwards compatibility that remains
 the core definition even as computational algorithms have left sweep
 behind.  But Cox models can't be computed using the sweep algorithm.)
 
 Terry Therneau
 
 
 On 04/24/2013 12:41 PM, r-help-requ...@r-project.org
 wrote:
  Hello All,

  Am having some trouble computing Type III SS in a Cox Regression using
  either drop1 or Anova from the car package. Am hoping that people will
  take a look to see if they can tell what's going on.

  Here is my R code:

  cox3grp <- subset(survData,
                    Treatment %in% c("DC", "DA", "DO"),
                    c(PTNO, Treatment, PFS_CENSORED, PFS_MONTHS, AGE, PS2))
  cox3grp <- droplevels(cox3grp)
  str(cox3grp)

  coxCV <- coxph(Surv(PFS_MONTHS, PFS_CENSORED == 1) ~ AGE + PS2,
                 data=cox3grp, method = "efron")
  coxCV

  drop1(coxCV, test="Chisq")

  require(car)
  Anova(coxCV, type="III")

  And here are my results:

  > cox3grp <- subset(survData,
  +                   Treatment %in% c("DC", "DA", "DO"),
  +                   c(PTNO, Treatment, PFS_CENSORED, PFS_MONTHS, AGE, PS2))
  > cox3grp <- droplevels(cox3grp)
  > str(cox3grp)
  'data.frame':    227 obs. of  6 variables:
   $ PTNO        : int  1195997 104625 106646 1277507 220506 525343 789119 817160 824224 82632 ...
   $ Treatment   : Factor w/ 3 levels "DC","DA","DO": 1 1 1 1 1 1 1 1 1 1 ...
   $ PFS_CENSORED: int  1 1 1 0 1 1 1 1 0 1 ...
   $ PFS_MONTHS  : num  1.12 8.16 6.08 1.35 9.54 ...
   $ AGE         : num  72 71 80 65 72 60 63 61 71 70 ...
   $ PS2         : Ord.factor w/ 2 levels "Yes"<"No": 2 2 2 2 2 2 2 2 2 2 ...
  > coxCV <- coxph(Surv(PFS_MONTHS, PFS_CENSORED == 1) ~ AGE + PS2,
  +                data=cox3grp, method = "efron")
  > coxCV
  Call:
  coxph(formula = Surv(PFS_MONTHS, PFS_CENSORED == 1) ~ AGE + PS2,
      data = cox3grp, method = "efron")


            coef exp(coef) se(coef)     z     p
  AGE    0.00492     1.005  0.00789 0.624 0.530
  PS2.L -0.34523

Re: [R] Make R 3.0 open .RData files

2013-04-25 Thread Dimitri Liakhovitski
Weird - because I was successful in doing it as I was installing earlier R
versions and moved from an earlier version to a newer version. Never had
any problems with making permanent changes to file associations in any
other programs either.







-- 
Dimitri Liakhovitski



[R] Weighted Principle Components analysis

2013-04-25 Thread Dimitri Liakhovitski
Hello!

I am doing Principal Components Analysis using the psych package:

mypc <- principal(mydata, 5, scores=TRUE)

However, I was asked to run a case-weighted PCA, using an individual
weight for each case.

I could use corr from the boot package to calculate the case-weighted
intercorrelation matrix. But if I use the intercorrelation matrix as input
(instead of the raw data), I am not going to get factor scores, which I do
need.

Any advice?
Thank you very much!
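
One possible way around the missing scores, sketched with base R only and offered as an assumption rather than as what psych does: compute the weighted correlation matrix with cov.wt(), take its eigenvectors, and project the weight-standardised raw data onto them. The data (iris) and the weights are stand-ins, and the resulting scores are unrotated component scores, so they will not match the rotated, rescaled scores that principal() returns by default.

```r
set.seed(1)  # arbitrary, for reproducibility only

# Stand-in data and case weights (both hypothetical).
mydata <- as.matrix(iris[, 1:4])
w <- runif(nrow(mydata))
w <- w / sum(w)            # weights normalised to sum to 1

# Weighted means, covariance matrix and correlation matrix.
cw <- cov.wt(mydata, wt = w, cor = TRUE)

# Principal components of the weighted correlation matrix.
e <- eigen(cw$cor, symmetric = TRUE)

# Standardise the raw data with the weighted means and SDs,
# then project onto the first two components to obtain scores.
zs <- scale(mydata, center = cw$center, scale = sqrt(diag(cw$cov)))
scores <- zs %*% e$vectors[, 1:2]
colnames(scores) <- c("PC1", "PC2")
head(scores)
```

Under the same weights, the scores are uncorrelated and their weighted variances equal the corresponding eigenvalues, which is a quick check that the weighting was applied consistently.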

-- 
Dimitri Liakhovitski



Re: [R] Make R 3.0 open .RData files

2013-04-25 Thread Prof Brian Ripley

On 25/04/2013 14:00, Duncan Murdoch wrote:

On 13-04-25 8:33 AM, Dimitri Liakhovitski wrote:

Hello!

I have Windows 7 Enterprise and two versions of R installed: 2.15.3 and
3.0.0.
Before I had R 3.0 I made it a setting that all .RData files - when I
double-click on them - were opened by R 2.15.3.
Now I want them to be opened by R 3.0 instead of R 2.15.3 (but I don't
want
to remove R 2.15.3. yet).

I right-click on some .RData file, select Open with -> Choose default
program and then click on Browse.

I browse to the folder where my R 3.0 is installed, then to the folder
bin, then to the folder x64 and select Rgui.exe.
However, when R opens - or after I shut R down and then double-click on
some .RData file and R opens, it is again R 2.15.3, not R3.0.

What am I doing wrong?

Of course, when I open R 3.0 directly, then it opens no problem.


This is really a question about Windows 7, not about R, but I would
guess you aren't telling it to make your choice permanent, or perhaps
you are not allowed by your administrator to make permanent changes to
file associations.  You should ask for local help.


We've encountered this for our student accounts, and think it is a bug 
in Windows 7.  If you remove the relevant old Registry entries first it 
should work.



--
Brian D. Ripley,                  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



Re: [R] Linear Interpolation : Missing rates

2013-04-25 Thread Katherine Gobin
Dear Mr Adams,

Thanks a lot for your solution. I understand it was very tricky and needed lot 
of application. Thanks again and do appreciate your efforts.

Regards

Katherine

--- On Thu, 25/4/13, Adams, Jean jvad...@usgs.gov wrote:

From: Adams, Jean jvad...@usgs.gov
Subject: Re: [R] Linear Interpolation : Missing rates
To: Katherine Gobin katherine_go...@yahoo.com
Cc: R help r-help@r-project.org
Date: Thursday, 25 April, 2013, 2:23 PM

Katherine,
Split the rate names into their currency and tenor parts and assign a numeric 
value to each tenor.  Choose a model to do your approximations (I used linear 
regression in the example below).  Use this model to generate estimates for all 
combinations of currency and tenor.


For example:
# split the rate names into currency and tenor
# (as.character() guards against rate_name being stored as a factor)
splitnames <- do.call(rbind, strsplit(as.character(df$rate_name), "_"))
df$currency <- as.factor(splitnames[, 1])
df$tenor <- splitnames[, 2]

# assign numeric value to each tenor
uniquetenors <- c("1w", "2w", "1m", "2m")
uniquedays <- c(7, 14, 30.5, 61)
df$tenordays <- uniquedays[match(df$tenor, uniquetenors)]

# fit a linear model of rate on tenordays for each currency
fit <- lm(rates ~ currency*tenordays, data=df)

# estimate rates for all combinations of currency and tenor
fulldf <- expand.grid(tenordays=unique(df$tenordays),
                      currency=unique(df$currency))
fulldf$est.rates = predict(fit, newdata=fulldf)

# merge observed rates with estimated rates
dfwithest <- merge(df, fulldf, all=TRUE)
Jean
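
The reply above swaps interpolation for a regression fit. If piecewise-linear interpolation between the neighbouring observed tenors is what is wanted, as the original question suggests with approx, something along these lines might work. It is a sketch under assumptions: the data are a small illustrative subset of the thread's values, repeated quotes are averaged per tenor first, the day counts reuse the mapping from the code above, and interpolate_currency is a hypothetical helper name.

```r
# Illustrative subset of the thread's data: USD_2w and EURO_1m are absent.
df <- data.frame(
  rate_name = c("USD_1w", "USD_1w", "USD_1m", "USD_1m", "USD_2m", "USD_2m",
                "EURO_1w", "EURO_1w", "EURO_2w", "EURO_2w", "EURO_2m", "EURO_2m"),
  rates = c(2.05, 2.07, 2.22, 2.24, 2.31, 2.33,
            1.82, 1.80, 1.98, 1.97, 2.10, 2.09),
  stringsAsFactors = FALSE
)

# Tenor lengths in days, reusing the mapping from the code above.
tenor_days <- c("1w" = 7, "2w" = 14, "1m" = 30.5, "2m" = 61)

parts <- do.call(rbind, strsplit(df$rate_name, "_", fixed = TRUE))
df$currency <- parts[, 1]
df$days <- unname(tenor_days[parts[, 2]])

# Average repeated quotes so each currency has one rate per tenor.
avg <- aggregate(rates ~ currency + days, data = df, FUN = mean)

# Hypothetical helper: interpolate one currency's curve at every tenor.
interpolate_currency <- function(d) {
  est <- approx(x = d$days, y = d$rates, xout = unname(tenor_days))
  data.frame(currency = d$currency[1],
             tenor = names(tenor_days),
             rate = est$y,
             observed = unname(tenor_days) %in% d$days)
}

full <- do.call(rbind, lapply(split(avg, avg$currency), interpolate_currency))
rownames(full) <- NULL
print(full)
```

Because the full tenor grid is evaluated for every currency, there is no need to know in advance which tenors are missing. With approx's default rule = 1, a tenor outside a currency's observed range comes back as NA rather than being extrapolated.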



On Thu, Apr 25, 2013 at 12:33 AM, Katherine Gobin katherine_go...@yahoo.com 
wrote:


Dear R forum

I have data.frame as

df = data.frame(rate_name = c("USD_1w", "USD_1w", "USD_1w", "USD_1w", "USD_1m",
"USD_1m", "USD_1m", "USD_1m", "USD_2m", "USD_2m", "USD_2m", "USD_2m",
"GBP_1w", "GBP_1w", "GBP_1w", "GBP_1w", "GBP_1m", "GBP_1m", "GBP_1m", "GBP_1m",
"GBP_2m", "GBP_2m", "GBP_2m", "GBP_2m", "EURO_1w", "EURO_1w", "EURO_1w",
"EURO_1w", "EURO_2w", "EURO_2w", "EURO_2w", "EURO_2w", "EURO_2m", "EURO_2m",
"EURO_2m", "EURO_2m"), rates = c(2.05, 2.07, 2.06, 2.06, 2.22, 2.24, 2.23,
2.23, 2.31, 2.33, 2.33, 2.31, 1.06, 1.08, 1.08, 1.08, 1.21, 1.21, 1.23, 1.21,
1.41, 1.39, 1.39, 1.37, 1.82, 1.82, 1.81, 1.80, 1.98, 1.98, 1.97, 1.97, 2.1,
2.09, 2.09, 2.11))

currency = c("EURO", "GBP", "USD")
tenor = c("1w", "2w", "1m", "2m", "3m")



# _



> df
   rate_name rates
1     USD_1w  2.05
2     USD_1w  2.07
3     USD_1w  2.06
4     USD_1w  2.06
5     USD_1m  2.22
6     USD_1m  2.24
7     USD_1m  2.23
8     USD_1m  2.23
9     USD_2m  2.31
10    USD_2m  2.33
11    USD_2m  2.33
12    USD_2m  2.31
13    GBP_1w  1.06
14    GBP_1w  1.08
15    GBP_1w  1.08
16    GBP_1w  1.08
17    GBP_1m  1.21
18    GBP_1m  1.21
19    GBP_1m  1.23
20    GBP_1m  1.21
21    GBP_2m  1.41
22    GBP_2m  1.39
23    GBP_2m  1.39
24    GBP_2m  1.37
25   EURO_1w  1.82
26   EURO_1w  1.82
27   EURO_1w  1.81
28   EURO_1w  1.80
29   EURO_2w  1.98
30   EURO_2w  1.98
31   EURO_2w  1.97
32   EURO_2w  1.97
33   EURO_2m  2.10
34   EURO_2m  2.09
35   EURO_2m  2.09
36   EURO_2m  2.11



As can be seen, USD_2w, GBP_2w and EURO_1m are missing and I need to
INTERPOLATE these rates, which can be done using approx or approxfun. In
reality I can have many currencies with many tenors. The problem is that when
the data.frame df is read or accessed in R, I do not know which tenors are
missing. For a given currency, it is possible that more than one consecutive
tenor is missing, e.g. in the case of EURO, I may have EURO_1w, EURO_2w and
then EURO_4m, so EURO_1m, EURO_2m and EURO_3m are missing.

I understand this is a somewhat vague question and I apologize for that.
Any suggestions, please.

Regards

Katherine

















Re: [R] Decomposing a List

2013-04-25 Thread Bert Gunter
Well, what you really want to do is convert the list to a matrix, and
it can be done directly and considerably faster than with the
(implicit) looping of sapply:

f1 <- function(l) sapply(l, "[", 1)
f2 <- function(l) matrix(unlist(l), nr=2)
l <- strsplit(paste(sample(LETTERS,1e6,rep=TRUE),
                    sample(1:10,1e6,rep=TRUE),sep="+"),"+",fix=TRUE)

## Then you get these results:

> system.time(x1 <- f1(l))
   user  system elapsed
   1.92    0.01    1.95
> system.time(x2 <- f2(l))
   user  system elapsed
   0.06    0.02    0.08
> system.time(x2 <- f2(l)[1,])
   user  system elapsed
    0.1     0.0     0.1
> identical(x1,x2)
[1] TRUE
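
For the original request (two vectors, one per position), the matrix idea can be sketched on a tiny list as follows. The input strings are made up to mirror the A1/B1 example in the question, and the approach assumes every list element has exactly two strings, as it does after this kind of strsplit().

```r
# Small stand-in for the list produced by strsplit() in the question.
L <- strsplit(c("A1+B1", "A2+B2", "A3+B3"), "+", fixed = TRUE)

# Each element has exactly two strings, so unlist() fills a 2-row matrix
# column by column: row 1 holds the first strings, row 2 the second.
m <- matrix(unlist(L), nrow = 2)
V1 <- m[1, ]
V2 <- m[2, ]

# Slower alternative that also checks each extracted piece is a single string.
V1_check <- vapply(L, `[`, character(1), 1)

print(V1)
print(V2)
```

If some elements could have a different number of pieces, the matrix would silently misalign, so the per-element extraction is the safer choice in that case.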


Cheers,
Bert






On Thu, Apr 25, 2013 at 3:32 AM, Ted Harding ted.hard...@wlandres.net wrote:
 Thanks, Jorge, that seems to work beautifully!
 (Now to try to understand why ... but that's for later).
 Ted.

 On 25-Apr-2013 10:21:29 Jorge I Velez wrote:
 Dear Dr. Harding,

 Try

 sapply(L, "[", 1)
 sapply(L, "[", 2)

 HTH,
 Jorge.-



 On Thu, Apr 25, 2013 at 8:16 PM, Ted Harding ted.hard...@wlandres.net wrote:

 Greetings!
 For some reason I am not managing to work out how to do this
 (in principle) simple task!

 As a result of applying strsplit() to a vector of character strings,
 I have a long list L (N elements), where each element is a vector
 of two character strings, like:

   L[1] = c("A1","B1")
   L[2] = c("A2","B2")
   L[3] = c("A3","B3")
   [etc.]

 From L, I wish to obtain (as directly as possible, e.g. avoiding
 a loop) two vectors each of length N where one contains the strings
 that are first in the pair, and the other contains the strings
 which are second, i.e. from L (as above) I would want to extract:

   V1 = c("A1","A2","A3",...)
   V2 = c("B1","B2","B3",...)

 Suggestions?

 With thanks,
 Ted.

 -
 E-Mail: (Ted Harding) ted.hard...@wlandres.net
 Date: 25-Apr-2013  Time: 11:16:46
 This message was sent by XFMail



 -
 E-Mail: (Ted Harding) ted.hard...@wlandres.net
 Date: 25-Apr-2013  Time: 11:31:57
 This message was sent by XFMail




-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm



Re: [R] Trouble Computing Type III SS in a Cox Regression

2013-04-25 Thread Bert Gunter
Please take this discussion offlist. It is **not** about R.

-- Bert


[R] problem with geom_point in ggplot using a different column

2013-04-25 Thread Angel Russo
I want to draw a boxplot where the geom_points are displayed based on the
ERBB2.MUT subset, and they should be displayed in the right box (based
both on the ERBB2.2064 field and ERBB2_Status).

However, given my command I currently only see the red points corresponding
to the MUT subset in one straight line, corresponding only to the ERBB2.2064
stratification on the x-axis. It doesn't take into account the ERBB2_Status
stratification. Can anyone help me?

Call  ERBB2|2064  ERBB2_Status  ERBB2-MUT
A      7.214E-01  CHANGE        MUT
B     -4.208E-02  NEUTRAL       MUT
D      1.080E+00  NEUTRAL       MUT
C      2.347E-01  NEUTRAL       MUT

ggplot(data=testdata, aes(x=Call, y=ERBB2.2064)) +
  geom_boxplot(aes(fill=ERBB2_Status), width=0.8) + theme_bw() +
  geom_point(data=subset(testdata, ERBB2.MUT=="MUT"),
             aes(shape=Call, color="Red"))



Re: [R] Trouble Computing Type III SS in a Cox Regression

2013-04-25 Thread Terry Therneau
You've missed the point of my earlier post, which is that type III is not an
answerable question.

   1. There are lots of ways to compare Cox models; the LRT is normally
considered the most reliable by serious authors.  There is usually not much
difference between the score, Wald, and LRT tests though, and the other two
are more convenient in many situations.

   2. Type III is a question that can't be addressed. SAS prints something out
with that label, but since they don't document what it is, and people with
in-depth knowledge of Cox models (like me) cannot figure out what a sensible
definition could actually be, there is nowhere to go.  How to do this in R
can't be answered.  (It has nothing to do with interactions.)

   3. If you have customers who think that the earth is flat, global warming
is a conspiracy, or that type III has special meaning, this is a re-education
issue, and I can't much help with that.


Terry T.



[R] tables: proper use of Hline() in tabular()

2013-04-25 Thread Liviu Andronic
Dear all,
I am unable to understand how Hline() works in tabular(). I've read
the vignette and the help page, and here this example compiles
perfectly fine:
latex( tabular( Species + Hline() + 1
~ Heading()*mean*All(iris), data=iris) )

However, if I try it on my own data it fails. Consider this:
set.seed(1)
Xa <- data.frame(p=rep(c("First group","Second group","Third group"),
                       each=10,len=30),
                 a=sample(c("Some long label","Some other long label",
                            "Yet another label"),
                          30, replace=TRUE),id=seq(30),
                 b=round(runif(30,10,20)),
                 c=round(runif(30,40,70)))

(x <- tabular(((p=factor(p))*(a=factor(a))+1) ~ (N = 1) + (b + c)*
              (mean+sd),data=Xa))

                                       b           c
 p            a                     N  mean  sd    mean  sd
 First group  Some long label        3 15.67 2.082 64.33  3.786
              Some other long label  4 14.75 2.630 52.25  8.461
              Yet another label      3 15.67 3.215 50.67  3.055
 Second group Some long label        2 17.00 1.414 57.50 10.607
              Some other long label  3 17.00 1.000 60.00  8.888
              Yet another label      5 15.00 3.082 58.20  8.672
 Third group  Some long label        4 13.75 3.594 58.75  5.909
              Some other long label  4 13.50 1.732 46.50  3.786
              Yet another label      2 16.00 1.414 50.00  4.243
              All                   30 15.13 2.501 55.37  8.045


I would like to place an Hline() between rows 3:4, rows 6:7, rows
9:10. But either way I place it I get something that doesn't compile
in LaTeX (! Misplaced \noalign. error). For example,
x <- tabular(((p=factor(p)))*(a=factor(a)) +(Hline() + 1) ~ (N = 1) + (b + c)*
             (mean+sd),data=Xa)
latex(x)

\begin{tabular}{llccccc}
\hline
 &  &  & \multicolumn{2}{c}{b} & \multicolumn{2}{c}{c} \\
p & a & N & mean & sd & mean & \multicolumn{1}{c}{sd} \\
\hline
First group & Some long label  & $\phantom{0}3$ & $15.67$ & $2.082$ & $64.33$ & $\phantom{0}3.786$ \\
 & Some other long label  & $\phantom{0}4$ & $14.75$ & $2.630$ & $52.25$ & $\phantom{0}8.461$ \\
 & Yet another label  & $\phantom{0}3$ & $15.67$ & $3.215$ & $50.67$ & $\phantom{0}3.055$ \\
Second group & Some long label  & $\phantom{0}2$ & $17.00$ & $1.414$ & $57.50$ & $10.607$ \\
 & Some other long label  & $\phantom{0}3$ & $17.00$ & $1.000$ & $60.00$ & $\phantom{0}8.888$ \\
 & Yet another label  & $\phantom{0}5$ & $15.00$ & $3.082$ & $58.20$ & $\phantom{0}8.672$ \\
Third group & Some long label  & $\phantom{0}4$ & $13.75$ & $3.594$ & $58.75$ & $\phantom{0}5.909$ \\
 & Some other long label  & $\phantom{0}4$ & $13.50$ & $1.732$ & $46.50$ & $\phantom{0}3.786$ \\
 & Yet another label  & $\phantom{0}2$ & $16.00$ & $1.414$ & $50.00$ & $\phantom{0}4.243$ \\
 & \hline %\\
 & All  & $30$ & $15.13$ & $2.501$ & $55.37$ & $\phantom{0}8.045$ \\
\hline
\end{tabular}


Please advise how to use Hline() in the example above. Regards,
Liviu

-- 
Do you know how to read?
http://www.alienetworks.com/srtest.cfm
http://goodies.xfce.org/projects/applications/xfce4-dict#speed-reader
Do you know how to write?
http://garbl.home.comcast.net/~garbl/stylemanual/e.htm#e-mail



[R] [SQL]

2013-04-25 Thread Ignacio Martinez
Hi,

The data for my new project are in a bunch of .sql files, instead of the
classic csv files that I'm used to working with.

Could someone explain to me how to read these files into R?

Thanks,

-Ignacio



Re: [R] Predictions with missing inputs

2013-04-25 Thread tonitogomez
Hi Bill,
Very clear response.
How about when the missing values are on the response variable being
predicted (y)? That is, the model is fitted only to complete cases, but then
I want to have predictions for all individual y (including those missing).
Can I use the mean for that variable 'y'?

EXAMPLE:
mynewdata <- mydata
mynewdata$y <- mean(mydata$y)
mypred <- predict(mymodel, mynewdata)

Thanks,
Manuel
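
For what it is worth, here is a minimal sketch of why the substitution may not be needed at all. It assumes mymodel is an lm() fit (other model classes may behave differently), and the toy data below are made up for illustration. predict() for lm objects only looks up the predictors in newdata, so rows with a missing response still get a prediction:

```r
# Toy data (hypothetical values): y has two missing entries, x is complete.
mydata <- data.frame(
  x = 1:10,
  y = c(2.1, 3.9, NA, 8.2, 9.8, 12.1, NA, 16.0, 18.1, 19.9)
)

# lm() drops incomplete rows by default (na.action = na.omit),
# so the model is fitted to the 8 complete cases only.
mymodel <- lm(y ~ x, data = mydata)

# The response is not used when predicting from newdata, so no value
# needs to be filled in for the missing y's.
mypred <- predict(mymodel, newdata = mydata)

length(mypred)  # one prediction per row of mydata
anyNA(mypred)   # are any predictions missing?
```

Missing values in the predictors are a different matter: those rows would still yield NA predictions.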



--
View this message in context: 
http://r.789695.n4.nabble.com/Predictions-with-missing-inputs-tp3302303p4665411.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Make R 3.0 open .RData files

2013-04-25 Thread Dimitri Liakhovitski
Brian, how do I remove the relevant old Registry entries?
Thank you!
Dimitri


On Thu, Apr 25, 2013 at 10:29 AM, Prof Brian Ripley
rip...@stats.ox.ac.uk wrote:

 On 25/04/2013 14:00, Duncan Murdoch wrote:

 On 13-04-25 8:33 AM, Dimitri Liakhovitski wrote:

 Hello!

 I have Windows 7 Enterprise and two versions of R installed: 2.15.3 and
 3.0.0.
 Before I had R 3.0 I made it a setting that all .RData files - when I
 double-click on them - were opened by R 2.15.3.
 Now I want them to be opened by R 3.0 instead of R 2.15.3 (but I don't
 want
 to remove R 2.15.3. yet).

 I right-click on some .RData file, select Open with - Choose default
 program and then click on Browse.

 I browse to the folder where my R 3.0 is installed, then to the folder
 bin, then to the folder x64 and select Rgui.exe.
 However, when R opens - or after I shut R down and then double-click on
 some .RData file and R opens, it is again R 2.15.3, not R3.0.

 What am I doing wrong?

 Of course, when I open R 3.0 directly, then it opens no problem.


 This is really a question about Windows 7, not about R, but I would
 guess you aren't telling it to make your choice permanent, or perhaps
 you are not allowed by your administrator to make permanent changes to
 file associations.  You should ask for local help.


 We've encountered this for our student accounts, and think it is a bug in
 Windows 7.  If you remove the relevant old Registry entries first it should
 work.


 --
 Brian D. Ripley,  rip...@stats.ox.ac.uk
 Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
 University of Oxford, Tel:  +44 1865 272861 (self)
 1 South Parks Road, +44 1865 272866 (PA)
 Oxford OX1 3TG, UK  Fax:  +44 1865 272595




-- 
Dimitri Liakhovitski



Re: [R] Make R 3.0 open .RData files

2013-04-25 Thread Prof Brian Ripley

On 25/04/2013 17:15, Dimitri Liakhovitski wrote:

Brian, how do I remove the relevant old Registry entries?


That is not an R question.   Our sysadmins do it 



--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595



Re: [R] Make R 3.0 open .RData files

2013-04-25 Thread Jeff Newmiller
a) See FAQ 2.17

b) Methods for configuring operating systems are off topic here. I will say 
there is a REGEDIT program in Windows, but there are potential permissions 
complications (you may not have them) and possible collateral damage (don't 
touch it if you don't understand it) that mean you should study up on this 
topic with an appropriate resource (book, forum, expert, system administrator, 
etc.) before attempting it.
---
Jeff NewmillerThe .   .  Go Live...
DCN:jdnew...@dcn.davis.ca.usBasics: ##.#.   ##.#.  Live Go...
  Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
/Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
--- 
Sent from my phone. Please excuse my brevity.



Re: [R] connecting matrices

2013-04-25 Thread eliza botto
Thanks Arun, the second one looks OK. Thanks indeed.
Elisa

 Date: Thu, 25 Apr 2013 07:37:25 -0700
 From: smartpink...@yahoo.com
 Subject: Re: connecting matrices
 To: eliza_bo...@hotmail.com
 CC: r-help@r-project.org
 
 HI Elisa,
 I guess there is a mistake.
 Check whether this is what you wanted.
 
 indx <- sort(el1, index.return=TRUE)$ix[1:3]
 list(el[,indx], indx)
 #[[1]]
 #     [,1] [,2] [,3]
 #[1,]   41   21   11
 #[2,]   42   22   12
 #[3,]   43   23   13
 #[4,]   44   24   14
 #[5,]   45   25   15
 #
 #[[2]]
 #[1] 9 5 3
 A.K.
 
 
 
 - Original Message -
 From: arun smartpink...@yahoo.com
 To: eliza botto eliza_bo...@hotmail.com
 Cc: R help r-help@r-project.org
 Sent: Thursday, April 25, 2013 10:09 AM
 Subject: Re: connecting matrices
 
 Dear Elisa,
 Try this:
 el <- matrix(1:100,ncol=20)
 set.seed(25)
 el1 <- matrix(sample(1:100,20,replace=TRUE),ncol=1)
 
 In the example you showed, there were no column names.
 
 list(el[,sort(el1)[1:3]],sort(el1,index.return=TRUE)$ix[1:3])
 #[[1]]
 #     [,1] [,2] [,3]
 #[1,]   31   61   71
 #[2,]   32   62   72
 #[3,]   33   63   73
 #[4,]   34   64   74
 #[5,]   35   65   75
 #
 #[[2]]
 #[1] 9 5 3
 A.K.
 
 
 
 
 
 
 From: eliza botto eliza_bo...@hotmail.com
 To: smartpink...@yahoo.com smartpink...@yahoo.com 
 Sent: Thursday, April 25, 2013 9:54 AM
 Subject: connecting matrices
 
 
 
 
 Dear Arun,
 
 [text file contains the exact format]
 Although the last code was absolutely correct and worked the way I wanted
 it to, I have an additional follow-up question.
 Suppose I have a matrix el. Here I show you only part of that
 matrix so that the code can run faster.
 
 el
      [,595586] [,595587] [,595588] [,595589] [,595590] [,595591] [,595592] [,595593]
 [1,]        55        55        55        55        55        55        55        55
 [2,]        59        59        59        59        59        59        60        60
 [3,]        60        60        60        61        61        62        61        61
 [4,]        61        62        63        62        63        63        62        63
      [,595594] [,595595] [,595596] [,595597] [,595598] [,595599] [,595600] [,595601]
 [1,]        55        55        56        56        56        56        56        56
 [2,]        60        61        57        57        57        57        57        57
 [3,]        62        62        58        58        58        58        58        59
 [4,]        63        63        59        60        61        62        63        60
      [,595602] [,595603] [,595604] [,595605] [,595606] [,595607] [,595608] [,595609]
 [1,]        56        56        56        56        56        56        56        56
 [2,]        57        57        57        57        57        57        57        57
 [3,]        59        59        59        60        60        60        61        61
 [4,]        61        62        63        61        62        63        62        63
      [,595610] [,595611] [,595612] [,595613] [,595614] [,595615] [,595616] [,595617]
 [1,]        56        56        56        56        56        56        56        56
 [2,]        57        58        58        58        58        58        58        58
 [3,]        62        59        59        59        59        60        60        60
 [4,]        63        60        61        62        63        61        62        63
 
 
 In connection with this matrix, there is another matrix which contains a
 coordination value for each column of matrix el
 
 el1
 
 [595586,]    5.67
 [595587,]   55.90
 [595588,]  515
 [595589,]  755
 [595590,]  955
 [595591,]    5.95
 [595592,]  575
 [595593,]  505
 [595594,]  505
 [595595,]  515
 [595596,] 5612
 [595597,]  506
 [595598,]  576
 [595599,] 5126
 [595600,] 5216
 [595601,] 5666
 [595602,]  526
 [595603,]    5.6
 [595604,]  156
 [595605,] 4556
 [595606,] 5556
 [595607,] 1256
 [595608,] 1256
 [595609,] 8756
 [595610,] 5906
 [595611,]  789
 [595612,] 5006
 [595613,] 1256
 [595614,] 3356
 [595615,] 7756
 [595616,] 4456
 [595617,] 3356
 
 What I want in the end is a list of two elements containing the 10 columns of
 el which have the lowest values in matrix el1.
 
 More precisely
 [[1]]
 [,595603] [,595586] [,595591]
        56        55        55
        57        59        59
        59        60        62
        62        61        63
 
 [[2]]
 5.6 5.67 5.95
 
 Is it possible to carry out such an operation?
 
 Thanks for your help,
 
 Elisa
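
The selection described above can be sketched with order(), which returns the positions of the smallest values directly. This is only an illustration, not A.K.'s code: it uses the first six columns of el and the first six values of el1 as posted, and the names k, idx and res are made up here.

```r
# First six columns of the posted matrix el, with their column numbers as names.
el <- matrix(c(55, 59, 60, 61,
               55, 59, 60, 62,
               55, 59, 60, 63,
               55, 59, 61, 62,
               55, 59, 61, 63,
               55, 59, 62, 63),
             nrow = 4,
             dimnames = list(NULL, 595586:595591))

# Matching first six values of el1 (one value per column of el).
el1 <- matrix(c(5.67, 55.90, 515, 755, 955, 5.95), ncol = 1)

k <- 2                                # number of columns to keep (10 in the real data)
idx <- order(el1[, 1])[seq_len(k)]    # positions of the k smallest el1 values

# drop = FALSE keeps a matrix even when k is 1.
res <- list(el[, idx, drop = FALSE], el1[idx, 1])
res
```

The difference from sort(el1)[1:3] is that order() gives positions rather than values, which is what is needed to index the columns of el.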
  


Re: [R] [SQL]

2013-04-25 Thread MacQueen, Don
With so little information, one can only guess.

I would guess your .sql files contain scripts written in the SQL
language, in which case you will need some local database support to help
you run those scripts in whatever database has the data. Perhaps the
scripts will output csv files.

If it turns out that you need to run the SQL scripts from within R, then
I'd suggest asking for help on R-sig-db.

-Don

-- 
Don MacQueen

Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062







Re: [R] [SQL]

2013-04-25 Thread Jeff Newmiller
The format of files with a .sql extension is not necessarily well defined. In 
most cases I have found, they are text files that contain SQL Data Definition 
Language statements (CREATE TABLE) and possibly Data Manipulation Language 
statements (INSERT INTO). You may be able to extract the portions of the files 
that contain data using read.csv and judicious use of the skip and nrows 
arguments, but you will first have to become familiar with the contents of the 
file using a text editor. If they are binary files, you may need to consult 
with the source of the data to identify the format used more precisely.
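
A rough sketch of the skip/nrows approach, using a made-up file: the file contents, the table name obs and the line counts below are all assumptions for illustration, and a real dump will need its own values worked out by inspecting it first.

```r
# Write a small text file that mimics a dump with non-data lines
# surrounding a comma-separated block (hypothetical contents).
tmp <- tempfile(fileext = ".sql")
writeLines(c(
  "-- hypothetical dump header",
  "CREATE TABLE obs (id INTEGER, value REAL);",
  "id,value",
  "1,2.5",
  "2,3.5",
  "3,4.5",
  "-- end of data"
), tmp)

# Look at the raw lines to find where the data block starts and how long it is.
readLines(tmp)

# Skip the two leading non-data lines; the next line is used as the header,
# and nrows = 3 stops reading before the trailing comment line.
obs <- read.csv(tmp, skip = 2, nrows = 3)
str(obs)
```

If the data are stored as INSERT INTO statements instead, this will not work as is, and loading the script into a database is likely the easier route.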
---
Jeff NewmillerThe .   .  Go Live...
DCN:jdnew...@dcn.davis.ca.usBasics: ##.#.   ##.#.  Live Go...
  Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
/Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
--- 
Sent from my phone. Please excuse my brevity.



Re: [R] problem with geom_point in ggplot using a different column

2013-04-25 Thread John Kane
https://github.com/hadley/devtools/wiki/Reproducibility

John Kane
Kingston ON Canada


 -Original Message-
 From: angerusso1...@gmail.com
 Sent: Thu, 25 Apr 2013 11:09:18 -0400
 To: r-help@r-project.org, r-help-requ...@r-project.org
 Subject: [R] problem with geom_point in ggplot using a different column
 
 I want to draw boxplot where the geom_points are displayed based on
 ERBB2.MUT subset and they should be displayed in the right box (based
 both on the ERBB2.2064 field and ERBB2_Status).
 
 However, given my command I currently only see red points corresponding
 to the MUT subset in one straight line corresponding to only the ERBB2.2064
 stratification on the x-axis. It doesn't take into account the ERBB2.Status
 stratification. Can anyone help me?

Is this supposed to represent your data?

 Call ERBB2|2064 ERBB2_Status ERBB2-MUT
 A 7.214E-01 CHANGE MUT
 B -4.208E-02 NEUTRAL MUT
 D 1.080E+00 NEUTRAL MUT
 C 2.347E-01 NEUTRAL MUT
 
 ggplot(data=testdata, aes(x=Call, y=ERBB2.2064)) +
 geom_boxplot(aes(fill=ERBB2_Status),width=0.8)+theme_bw()+geom_point(data=subset(testdata,ERBB2.MUT=="MUT"),aes(shape=Call,color="Red"))
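
As a sketch of how the posted rows could be turned into something reproducible (assuming the four rows above really are the data; the object name testdata is taken from the ggplot call), read.table() can read them from a string. With the default check.names = TRUE, the headers ERBB2|2064 and ERBB2-MUT are converted to the syntactic names used in the plotting code:

```r
# Read the posted rows from a text string; blank lines are skipped by default.
testdata <- read.table(header = TRUE, text = "
Call ERBB2|2064 ERBB2_Status ERBB2-MUT
A 7.214E-01 CHANGE MUT
B -4.208E-02 NEUTRAL MUT
D 1.080E+00 NEUTRAL MUT
C 2.347E-01 NEUTRAL MUT
")

# Characters that are not valid in R names ('|' and '-') have become dots.
names(testdata)

# dput() output can be pasted into an email so others can recreate the object.
dput(testdata)
```

Note that every row in this sample has ERBB2.MUT equal to "MUT", so the subset in geom_point() keeps all four rows.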
 


[R] Problem with package RNetCDF when attached

2013-04-25 Thread Marc Girondot

I have a problem with the RNetCDF package on Mac OS X 10.8.3, R 3.0.0.
If you have a solution, it would be great!
Thanks a lot.
Marc Girondot

 install.packages("RNetCDF")
trying URL 
'http://cran.at.r-project.org/bin/macosx/contrib/3.0/RNetCDF_1.6.1-2.tgz'

Content type 'application/x-gzip' length 2071758 bytes (2.0 Mb)
opened URL
==
downloaded 2.0 Mb


The downloaded binary packages are in
/var/folders/6f/w2t25jws2ng_qqnvl4xgnnc0gn/T//Rtmphm82to/downloaded_packages
 library(RNetCDF, 
lib.loc="/Library/Frameworks/R.framework/Versions/3.0/Resources/library")

Error : .onLoad failed in loadNamespace() for 'RNetCDF', details:
call: NULL
error: I/O error (udunits)
Error: package or namespace load failed for 
‘RNetCDF’


--
__
Marc Girondot, Pr

Laboratoire Ecologie, Systématique et Evolution
Equipe de Conservation des Populations et des Communautés
CNRS, AgroParisTech et Université Paris-Sud 11 , UMR 8079
Bâtiment 362
91405 Orsay Cedex, France

Tel:  33 1 (0)1.69.15.72.30   Fax: 33 1 (0)1.69.15.73.53
e-mail: marc.giron...@u-psud.fr
Web: http://www.ese.u-psud.fr/epc/conservation/Marc.html
Skype: girondot



[R] RStudio.. text editor

2013-04-25 Thread Santosh
Dear Rxperts/RStudio users,
Is there a way to set tabs (the TAB key) in the text editor of RStudio,
similar to the way customization can be done in Tinn-R?

Thanks and regards,
Santosh



Re: [R] RStudio.. text editor

2013-04-25 Thread Duncan Murdoch

On 25/04/2013 3:04 PM, Santosh wrote:

Dear Rxperts/RStudio users,
Is there a way to set tabs (the TAB key) in the text editor of RStudio,
similar to the way customization can be done in Tinn-R?


You're asking on the wrong list.  RStudio has its own support forums.  
Start on their web site...


Duncan Murdoch



Re: [R] pglm package: fitted values and residuals

2013-04-25 Thread Paul Johnson
On Wed, Apr 24, 2013 at 4:37 PM, Achim Zeileis achim.zeil...@uibk.ac.at wrote:
 On Wed, 24 Apr 2013, Paul Johnson wrote:

 On Wed, Apr 24, 2013 at 3:11 AM,  alfonso.carf...@uniparthenope.it
 wrote:

 I'm using the package pglm and I have estimated a random probit model.
 I need to save in a vector the fitted values and the residuals of the model
 but I can not do it.

 I tried with the command fitted.values using the following procedure
 without results:

 This is one of those "ask the pglm authors" questions. You should take it
 up with the authors of the package.  There is a specialized email list
 R-sig-mixed where you will find more people working on this exact same
 thing.

 pglm looks like fun to me, but it is not quite done, so far as I can tell.

 I'm sure that there are many. One of my attempts to write up a list is in
 Table 1 of vignette("betareg", package = "betareg").

Yes! That's exactly the list I was thinking of.  It was driving me
crazy I could not find it.

Thanks for the explanation.  I don't think I should have implied that
the pglm author must actually implement all the methods, it is
certainly acceptable to leverage the methods that exist.  It just
happened that the ones I tested were not implemented by any of the
affiliated packages.

But this thread leads me to one question I've wondered about recently.

Suppose I run somebody's regression function and out comes an object.

Do we have a way to ask that object "what are all of the methods that
might apply to you?"  Here's why I wondered. You've noticed that
predict.lm has the interval="confidence" argument, but predict.glm
does not. So if I receive a regression model, I'd like to say to it
"do you have a predict method?" and if I could get that predict method,
I could check to see if there is a formal argument "interval". If there
is not, maybe I'd craft one for them.
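
The check described above can be sketched in base R. The two helper functions below are hypothetical names made up for this illustration; they only cover S3 methods (S4 classes would need a different lookup), and they use the built-in cars data for the example fits.

```r
# Hypothetical helper: return the first S3 method of `generic` that matches
# any class of `object`, or NULL if no such method is found.
find_method <- function(generic, object) {
  for (cl in class(object)) {
    m <- utils::getS3method(generic, cl, optional = TRUE)
    if (!is.null(m)) return(m)
  }
  NULL
}

# Hypothetical helper: does that method have a formal argument named `arg`?
method_has_arg <- function(generic, object, arg) {
  m <- find_method(generic, object)
  !is.null(m) && arg %in% names(formals(m))
}

fit_lm  <- lm(dist ~ speed, data = cars)
fit_glm <- glm(dist ~ speed, data = cars, family = gaussian())

# class(fit_glm) is c("glm", "lm"), so predict.glm is found before predict.lm.
lm_has_interval  <- method_has_arg("predict", fit_lm,  "interval")
glm_has_interval <- method_has_arg("predict", fit_glm, "interval")
c(lm = lm_has_interval, glm = glm_has_interval)
```

Walking class(object) in order mirrors how S3 dispatch picks a method, which is why the glm fit reports on predict.glm even though it also inherits from lm.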

pj



 Personally, I don't write anova() methods for my model objects because I can
 leverage lrtest() and waldtest() from lmtest and linearHypothesis() and
 deltaMethod() from car as long as certain standard methods are available,
 including coef(), vcov(), logLik(), etc.

 Similarly, an AIC() method is typically not needed as long as logLik() is
 available. And BIC() works if nobs() is available in addition.

 Best,
 Z


 pj

 library(pglm)

 m1_S <- pglm(Feed ~ Cons_PC_1 + imp_gen_1 + LGDP_PC_1 + lnEI_1 + SH_Ren_1,
              data, family=binomial("probit"), model="random", method="bfgs",
              index=c("Year","IDCountry"))

 m1_S$fitted.values
 residuals(m1)


 Can someone help me about it?

 Thanks







--
Paul E. Johnson
Professor, Political Science  Assoc. Director
1541 Lilac Lane, Room 504  Center for Research Methods
University of Kansas University of Kansas
http://pj.freefaculty.org   http://quant.ku.edu



[R] [R-pkgs] glmnet webinar Friday May 3 at 10am PDT

2013-04-25 Thread Trevor Hastie
I will be giving a webinar on glmnet on Friday May 3, 2013 at 10am PDT
(Pacific Daylight Time).
The one-hour webinar will consist of:

- Intro to lasso and elastic net regularization, and coefficient paths 
- Why is glmnet so efficient and flexible 
- New features of the latest version of glmnet 
- Live glmnet demonstration 
- Question and Answer period 

To sign up for the webinar, please go to
https://www3.gotomeeting.com/register/77950

The webinar is hosted by the Orange County R User Group and will be moderated
by its president, Ray DiGiacomo.


 

  Trevor Hastie   has...@stanford.edu  
  Professor, Department of Statistics, Stanford University
  Phone: (650) 725-2231 Fax: (650) 725-8977  
  URL: http://www.stanford.edu/~hastie  
   address: room 104, Department of Statistics, Sequoia Hall
   390 Serra Mall, Stanford University, CA 94305-4065  
 
--





___
R-packages mailing list
r-packa...@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages



Re: [R] RStudio.. text editor

2013-04-25 Thread John Kane
I have not used Tinn-R in a while, but Tools > Options > Code Editing perhaps?

John Kane
Kingston ON Canada


 -Original Message-
 From: santosh2...@gmail.com
 Sent: Thu, 25 Apr 2013 12:04:17 -0700
 To: r-help@r-project.org
 Subject: [R] RStudio.. text editor
 
 Dear Rxperts/RStudio users,
 Is there a way to set tabs (the TAB key) in the text editor of RStudio,
 similar to the way customization can be done in Tinn-R?
 
 Thanks and regards,
 Santosh
 


Re: [R] pglm package: fitted values and residuals

2013-04-25 Thread Ista Zahn
On Thu, Apr 25, 2013 at 3:14 PM, Paul Johnson pauljoh...@gmail.com wrote:
 Suppose I run somebody's regression function and out comes an object.

 Do we have a way to ask that object "what are all of the methods that
 might apply to you?"

Yes, minus the might:

library(pglm)
example(pglm) # produces an object named la
# lists functions with methods for objects of this class
sapply(class(la), function(x) methods(class=x))

Best,
Ista



Re: [R] RStudio.. text editor

2013-04-25 Thread Santosh
Great Thanks so much!



On Thu, Apr 25, 2013 at 12:30 PM, John Kane jrkrid...@inbox.com wrote:

 I have not used Tinn-R in a while, but Tools > Options > Code Editing
 perhaps?

 John Kane
 Kingston ON Canada


  -Original Message-
  From: santosh2...@gmail.com
  Sent: Thu, 25 Apr 2013 12:04:17 -0700
  To: r-help@r-project.org
  Subject: [R] RStudio.. text editor
 
  Dear Rxperts/RStudio users,
  Is there a way to set tabs (the TAB key) in the text editor of RStudio,
  similar to the way customization can be done in Tinn-R?
 
  Thanks and regards,
  Santosh
 


[R] Error in validObject(.Object) :

2013-04-25 Thread Vahe nr
Hi all,

I am trying to run the R package NDVITS, and I am getting the following error:
Error in validObject(.Object) :
  invalid class “GridTopology” object: cells.dim has incorrect dimension

Can you please suggest how to understand and resolve this error?


Regards,
 Vahe



Re: [R] pglm package: fitted values and residuals

2013-04-25 Thread Achim Zeileis

On Thu, 25 Apr 2013, Ista Zahn wrote:


On Thu, Apr 25, 2013 at 3:14 PM, Paul Johnson pauljoh...@gmail.com wrote:

On Wed, Apr 24, 2013 at 4:37 PM, Achim Zeileis achim.zeil...@uibk.ac.at wrote:

On Wed, 24 Apr 2013, Paul Johnson wrote:


On Wed, Apr 24, 2013 at 3:11 AM,  alfonso.carf...@uniparthenope.it
wrote:


I'm using the package pglm and I have estimated a random probit model.
I need to save the fitted values and the residuals of the model in a vector,
but I cannot do it.

I tried with the command fitted.values using the following procedure
without results:


This is one of those "ask the pglm authors" questions. You should take it
up with the authors of the package.  There is a specialized email list
R-sig-mixed where you will find more people working on this exact same
thing.

pglm looks like fun to me, but it is not quite done, so far as I can tell.


I'm sure that there are many. One of my attempts to write up a list is in
Table 1 of vignette("betareg", package = "betareg").


Yes! That's exactly the list I was thinking of.  It was driving me
crazy I could not find it.

Thanks for the explanation.  I don't think I should have implied that
the pglm author must actually implement all the methods, it is
certainly acceptable to leverage the methods that exist.  It just
happened that the ones I tested were not implemented by any of the
affiliated packages.

But this thread leads me to one question I've wondered about recently.

Suppose I run somebody's regression function and out comes an object.

Do we have a way to ask that object "what are all of the methods that
might apply to you?"


Yes, minus the "might":

library(pglm)
example(pglm) # produces an object named la
# list the functions that have methods for objects of this class
sapply(class(la), function(x) methods(class=x))


Well, this shows you the methods that are available for the class but not 
necessarily what arguments are supported. And even if the arguments are 
available they do not necessarily mean the same thing. And some things may 
or may not work via inheritance...


So coming back to Paul's question: Yes, I think it would be nice to have 
support for this and in fact I have thought about similar infrastructure. 
But so far I didn't have a good idea for a sufficiently robust/reliable 
implementation. There are just so many details in the different model 
objects that can be handled differently.


Best,
Z




Re: [R] Trouble Computing Type III SS in a Cox Regression

2013-04-25 Thread Rolf Turner

On 26/04/13 03:40, Terry Therneau wrote:

(In response to a question about computing type III sums of squares in a
Cox regression):

SNIP


If you have customers who think that the earth is flat, global warming 
is a conspiracy, or that type III has special meaning this is a 
re-education issue, and I can't much help with that.


Fortune nomination!

cheers,

Rolf



[R] Transferring R to another computer, R_HOME_DIR

2013-04-25 Thread Saptarshi Guha
Hello,

I was looking at the R (installed on RHEL6) shell script and saw
R_HOME_DIR=/usr/lib64/R. Nowhere (and I could have got it wrong) does
it read in the environment value R_HOME_DIR. I have the need to rsync
the entire folder below /usr/lib64/R to another computer into another
directory location. Without changing the R shell script, how can i
force it read in R_HOME_DIR?

Or maybe i misunderstood the bash source?

(Note, i cannot recompile on target machine)

Cheers
Saptarshi

1. I also realize Rscript will not work (i think path is hard coded in the
source)

Beginning of /usr/lib64/R/bin/R

R_HOME_DIR=/usr/lib64/R
if test "${R_HOME_DIR}" = "/usr/lib64/R"; then
   case "linux-gnu" in
   linux*)
     run_arch=`uname -m`
     case "$run_arch" in
        x86_64|mips64|ppc64|powerpc64|sparc64|s390x)
          libnn=lib64
          libnn_fallback=lib
        ;;
        *)
          libnn=lib
          libnn_fallback=lib64
        ;;
     esac
     if [ -x "/usr/${libnn}/R/bin/exec/R" ]; then
        R_HOME_DIR=/usr/lib64/R
     elif [ -x "/usr/${libnn_fallback}/R/bin/exec/R" ]; then
        R_HOME_DIR=/usr/lib64/R
     ## else -- leave alone (might be a sub-arch)
     fi
     ;;
   esac
fi



Re: [R] Transferring R to another computer, R_HOME_DIR

2013-04-25 Thread p_connolly

Quoting Saptarshi Guha saptarshi.g...@gmail.com:


Hello,

I was looking at the R (installed on RHEL6) shell script and saw
R_HOME_DIR=/usr/lib64/R. Nowhere (and I could have got it wrong) does
it read in the environment value R_HOME_DIR. I have the need to rsync
the entire folder below /usr/lib64/R to another computer into another
directory location. Without changing the R shell script, how can i
force it read in R_HOME_DIR?

Or maybe i misunderstood the bash source?

(Note, i cannot recompile on target machine)


If you can't compile on the target machine, that indicates that you wouldn't
have access to /usr/lib64/R anyway, so you need a different approach.

Fortunately, it's easy to compile into your home directory where you do have
write access.  The INSTALL file in the distributed tar.gz file shows you how
to compile where you want and what link you need to make it accessible.  Even
though the file is called INSTALL, it explains how it's not necessary to
install R in order to use it.

HTH










[R] [newbie] how to find and combine geographic maps with particular features?

2013-04-25 Thread Tom Roche

SUMMARY:

Specific problem: I'm regridding biomass-burning emissions from a
global/unprojected inventory to a regional projection (LCC over North
America). I need to have boundaries for Canada, Mexico, and US
(including US states), but also Caribbean and Atlantic nations
(notably the Bahamas). I would also like to add Canadian provinces and
Mexican states. How to put these together?

General problem: are there references regarding

* sources for different geographical and political features?

* combining maps for the different R graphics packages?

DETAILS:

(Apologies if this is a FAQ, but googling has not helped me with this.)

I'd appreciate help with a specific problem, as well as guidance
(e.g., pointers to docs) regarding the larger topic of combining
geographical maps (especially projected ones, i.e., not just lon-lat)
on plots of regional data (i.e., data that is multinational but not
global).
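
On the narrower question of combining boundaries from different sources, one possible approach (a sketch, not a tested recipe for the M3 workflow) relies on the fact that line objects returned by maps::map(..., plot=FALSE) store coordinates as x and y vectors in which NA separates the individual polylines. Two such objects in the same coordinate system can then be concatenated with an NA between them. The helper combine_map_lines and the toy coordinates below are hypothetical.

```r
# Hypothetical helper: join two "maps"-style line sets (x/y vectors in which
# NA separates polylines). Both must already be in the same projection.
combine_map_lines <- function(a, b) {
  list(
    x = c(a$x, NA, b$x),
    y = c(a$y, NA, b$y),
    names = c(a$names, b$names)
  )
}

# Toy stand-ins for, e.g., a national-boundary set and an island outline.
mainland <- list(x = c(0, 1, 1, 0, 0), y = c(0, 0, 1, 1, 0), names = "mainland")
island   <- list(x = c(2, 3, 3, 2),    y = c(0, 0, 1, 0),    names = "island")

both <- combine_map_lines(mainland, island)
sum(is.na(both$x))  # one NA separator between the two outlines
```

Because lines() breaks the path at each NA, passing both$x and both$y to lines() draws the two outlines separately. Any projection has to be applied to each source before combining.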

My specific problem is

https://bitbucket.org/tlroche/gfed-3.1_global_to_aqmeii-na/downloads/GFED-3.1_2008_N2O_monthly_emissions_regrid_20130404_1344.pdf

which plots N2O concentrations from a global inventory of fire
emissions (GFED) regridded to a North American projection. (See

https://bitbucket.org/tlroche/gfed-3.1_global_to_aqmeii-na

for details.) The plot currently includes boundaries for Canada,
Mexico, and US (including US states, since this is being done for a US
agency), which are obtained by calling code from package=M3

http://cran.r-project.org/web/packages/M3/

like

https://bitbucket.org/tlroche/gfed-3.1_global_to_aqmeii-na/src/95484c5d63502ab146402cedc3612dcdaf629bd7/vis_regrid_vis.r?at=master
 ## get projected North American map
 NorAm.shp <- project.NorAm.boundaries.for.CMAQ(
   units='m',
   extents.fp=template_input_fp,
   extents=template.extents,
   LCC.parallels=c(33,45),
   CRS=out.crs)

https://bitbucket.org/tlroche/gfed-3.1_global_to_aqmeii-na/src/95484c5d63502ab146402cedc3612dcdaf629bd7/visualization.r?at=master
 # database: Geographical database to use.  Choices include "state"
 #   (default), "world", "worldHires", "canusamex", etc.  Use
 #   "canusamex" to get the national boundaries of Canada, the
 #   USA, and Mexico, along with the boundaries of the states.
 #   The other choices ("state", "world", etc.) are the names of
 #   databases included with the ‘maps’ and ‘mapdata’ packages.

 project.M3.boundaries.for.CMAQ <- function(
   database='state', # see `?M3::get.map.lines.M3.proj`
   units='m',        # or 'km': see `?M3::get.map.lines.M3.proj`
   extents.fp,       # path to extents file
   extents,          # raster::extent object
   # LCC standard parallels: see
   # https://github.com/TomRoche/cornbeltN2O/wiki/AQMEII-North-American-domain#wiki-EPA
   LCC.parallels=c(33,45),
   CRS               # see `sp::CRS`
 ) {

   library(M3)
   ## Will replace raw LCC map's coordinates with:
   metadata.coords.IOAPI.list <- M3::get.grid.info.M3(extents.fp)
   metadata.coords.IOAPI.x.orig <- metadata.coords.IOAPI.list$x.orig
   metadata.coords.IOAPI.y.orig <- metadata.coords.IOAPI.list$y.orig
   metadata.coords.IOAPI.x.cell.width <-
     metadata.coords.IOAPI.list$x.cell.width
   metadata.coords.IOAPI.y.cell.width <-
     metadata.coords.IOAPI.list$y.cell.width

   library(maps)
   map.lines <- M3::get.map.lines.M3.proj(
     file=extents.fp, database=database, units="m")
   # dimensions are in meters, not cells. TODO: take argument
   map.lines.coords.IOAPI.x <-
     (map.lines$coords[,1] - metadata.coords.IOAPI.x.orig)
   map.lines.coords.IOAPI.y <-
     (map.lines$coords[,2] - metadata.coords.IOAPI.y.orig)
   map.lines.coords.IOAPI <-
     cbind(map.lines.coords.IOAPI.x, map.lines.coords.IOAPI.y)

   # # start debugging
   # class(map.lines.coords.IOAPI)
   # # [1] "matrix"
   # summary(map.lines.coords.IOAPI)
   # #  map.lines.coords.IOAPI.x map.lines.coords.IOAPI.y
   # #  Min.   : 283762          Min.   : 160844
   # #  1st Qu.:2650244          1st Qu.:1054047
   # #  Median :3469204          Median :1701052
   # #  Mean   :3245997          Mean   :1643356
   # #  3rd Qu.:4300969          3rd Qu.:2252531
   # #  Max.   :4878260          Max.   :2993778
   # #  NA's   :168              NA's   :168
   # #   end debugging

   # Note above is not zero-centered, like our extents:
   # extent : -2556000, 2952000, -1728000, 186  (xmin, xmax, ymin, ymax)
   # So gotta add (xmin, ymin) below.

   ## Get LCC state map
   # see
   # http://stackoverflow.com/questions/14865507/how-to-display-a-projected-map-on-an-rlatticelayerplot
   map.IOAPI <- maps::map(
     database="state", projection="lambert", par=LCC.parallels, plot=FALSE)
   #  parameters to lambert: ^
   #  see mapproj::mapproject
   map.IOAPI$x <- map.lines.coords.IOAPI.x + extents.xmin
   map.IOAPI$y <- map.lines.coords.IOAPI.y +

Re: [R] Decomposing a List

2013-04-25 Thread David Winsemius

On Apr 25, 2013, at 7:53 AM, Bert Gunter wrote:

 Well, what you really want to do is convert the list to a matrix, and
 it can be done directly and considerably faster than with the
 (implicit) looping of sapply:
 
 f1 <- function(l) sapply(l, "[", 1)
 f2 <- function(l) matrix(unlist(l), nr=2)
 l <-
 strsplit(paste(sample(LETTERS,1e6,rep=TRUE),sample(1:10,1e6,rep=TRUE),sep="+"),"+",fix=TRUE)

Consider this alternative:

L = list( c("A1","B1"), c("A2","B2"), c("A3","B3") )
simplify2array(L)
     [,1] [,2] [,3]
[1,] "A1" "A2" "A3"
[2,] "B1" "B2" "B3"

-- 
David.

 

 ## Then you get these results:
 
 system.time(x1 <- f1(l))
   user  system elapsed
   1.92    0.01    1.95
 system.time(x2 <- f2(l))
   user  system elapsed
   0.06    0.02    0.08
 system.time(x2 <- f2(l)[1,])
   user  system elapsed
    0.1     0.0     0.1
 identical(x1,x2)
 [1] TRUE
 
 
 Cheers,
 Bert
 
 
 
 
 
 
 On Thu, Apr 25, 2013 at 3:32 AM, Ted Harding ted.hard...@wlandres.net wrote:
 Thanks, Jorge, that seems to work beautifully!
 (Now to try to understand why ... but that's for later).
 Ted.
 
 On 25-Apr-2013 10:21:29 Jorge I Velez wrote:
 Dear Dr. Harding,
 
 Try
 
 sapply(L, "[", 1)
 sapply(L, "[", 2)
 
 HTH,
 Jorge.-
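
As for why this works: in R the subsetting operator is itself a function named `[`, so sapply(L, "[", 1) calls that function on every element of the list with 1 as the index. A small illustration with made-up data (the variable names are only for this example):

```r
L <- list(c("A1", "B1"), c("A2", "B2"), c("A3", "B3"))

# `[` is an ordinary function: these two expressions are equivalent.
x <- L[[1]]
x[1]
`[`(x, 1)

# So sapply() passes each element of L to `[` with 1 (or 2) as the index.
V1 <- sapply(L, "[", 1)
V2 <- sapply(L, "[", 2)

# The same thing written with an explicit anonymous function.
V1_explicit <- sapply(L, function(v) v[1])
```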
 
 
 
 On Thu, Apr 25, 2013 at 8:16 PM, Ted Harding
 ted.hard...@wlandres.net wrote:
 
 Greetings!
 For some reason I am not managing to work out how to do this
 (in principle) simple task!
 
 As a result of applying strsplit() to a vector of character strings,
 I have a long list L (N elements), where each element is a vector
 of two character strings, like:
 
  L[1] = c("A1","B1")
  L[2] = c("A2","B2")
  L[3] = c("A3","B3")
  [etc.]
 
 From L, I wish to obtain (as directly as possible, e.g. avoiding
 a loop) two vectors each of length N where one contains the strings
 that are first in the pair, and the other contains the strings
 which are second, i.e. from L (as above) I would want to extract:
 
  V1 = c("A1","A2","A3",...)
  V2 = c("B1","B2","B3",...)
 
 Suggestions?
 
 With thanks,
 Ted.
 
 -
 E-Mail: (Ted Harding) ted.hard...@wlandres.net
 Date: 25-Apr-2013  Time: 11:16:46
 This message was sent by XFMail
 
 
 
 -
 E-Mail: (Ted Harding) ted.hard...@wlandres.net
 Date: 25-Apr-2013  Time: 11:31:57
 This message was sent by XFMail
 
 
 
 
 -- 
 
 Bert Gunter
 Genentech Nonclinical Biostatistics
 
 Internal Contact Info:
 Phone: 467-7374
 Website:
 http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
 

David Winsemius
Alameda, CA, USA



[R] advert: courses in R use, programming in Seattle

2013-04-25 Thread Thomas Lumley
There are three courses in R at the Summer Institute for Statistics
Genetics, in Seattle this July, ranging from completely introductory to
advanced programming.

The intermediate and advanced courses are taught by me and Ken Rice, the
(new) introductory course by Ken and Tim Thornton.

More information at http://www.biostat.washington.edu/suminst/sisg/schedule


  -thomas

-- 
Thomas Lumley
Professor of Biostatistics
University of Auckland



Re: [R] Decomposing a List

2013-04-25 Thread Bert Gunter
Well...

With the same list, l, as before:

 system.time(x3 <- simplify2array(l))
   user  system elapsed
   2.11    0.05    2.20
 system.time(x2 <- f2(l)) ## the matrix(unlist(...))  one
   user  system elapsed
   0.11    0.00    0.11
 identical(x2,x3)
[1] TRUE

So kind of a big difference if you care about efficiency...
(and I can't remember all those specialized functions, anyway!)
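
For anyone who wants to try the comparison, here is a small self-contained check (toy data, not the million-element list timed above) that the three approaches in this thread give the same answer. Wrapping each call in system.time() on a large list, as above, shows the speed difference; no timings are claimed for this sketch.

```r
# Toy version of the strsplit() result discussed in this thread.
L <- strsplit(c("A1+B1", "A2+B2", "A3+B3"), "+", fixed = TRUE)

# 1. sapply with the subsetting function
V1 <- sapply(L, "[", 1)
V2 <- sapply(L, "[", 2)

# 2. unlist into a 2-row matrix (column j holds the j-th pair)
m <- matrix(unlist(L), nrow = 2)

# 3. simplify2array
s <- simplify2array(L)

identical(m[1, ], V1)
identical(m[2, ], V2)
identical(m, s)
```

The matrix(unlist(...)) route assumes every element of the list has exactly two pieces; a string with no "+" or with two of them would silently shift everything after it.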

-- Bert




-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm



[R] Input Chinese characters not correctly echoed in ESS

2013-04-25 Thread lcn
I had this weird encoding issue in my Emacs and R environment. Display of
Chinese characters is all good with my .Rprofile setting
Sys.setlocale("LC_ALL", "zh_CN.utf-8"), except for the echo of input ones.

 linkTexts[5]
  font
使用帮助
 functionNotExist()
错误: 没有functionNotExist这个函数
 fire <- "你好"
 fire
[1]   

As we can see, Chinese characters contained in the vector linkTexts,
Chinese error messages, and input Chinese characters all can be perfectly
shown, yet the echo of input characters were only shown as blank
placeholders.
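
One way to narrow down whether the problem lies in R or in the display layer (a diagnostic sketch, not a fix) is to inspect the string itself rather than its echo. If the encoding mark and the byte sequence look right, R holds the characters correctly and the terminal or Emacs buffer coding system is the more likely culprit.

```r
# "\u4f60\u597d" is the escaped form of the two characters typed above.
fire <- "\u4f60\u597d"

Encoding(fire)               # encoding mark R attached to the string
nchar(fire, type = "chars")  # number of characters
nchar(fire, type = "bytes")  # number of bytes (3 per character in UTF-8)
charToRaw(fire)              # raw bytes; e4 bd a0 e5 a5 bd in UTF-8

Sys.getlocale("LC_CTYPE")    # locale R is using for character handling
```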

sessionInfo() is here, which is as expected given the
Sys.setlocale("LC_ALL", "zh_CN.utf-8") setting:

 sessionInfo()
R version 2.15.2 (2012-10-26)
Platform: i386-apple-darwin9.8.0/i386 (32-bit)

locale:
[1] zh_CN.utf-8/zh_CN.utf-8/zh_CN.utf-8/C/zh_CN.utf-8/C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods
base

other attached packages:
[1] XML_3.96-1.1

loaded via a namespace (and not attached):
[1] compiler_2.15.2 tools_2.15.2

And I have no locale settings in the .Emacs file.

To me, this seems to be an Emacs encoding issue, but I just don't know how
to correct it. Any idea or suggestion? Thanks.



Re: [R] getting started in parallel computing on a windows OS

2013-04-25 Thread Benjamin Caldwell
Thanks for this, Martin. I'll start retooling and let you know how it goes.

Ben Caldwell
Graduate fellow
On Apr 24, 2013 4:34 PM, Martin Morgan mtmor...@fhcrc.org wrote:

 On 04/24/2013 02:50 PM, Benjamin Caldwell wrote:

 Dear R help,

 I've what I think is a fairly simple parallel problem, and am getting
 bogged down in documentation and packages for much more complex
 situations.

 I have a big matrix [30^5, 5]. I have a function that will act on each row
 of that matrix sequentially and output the 'best' result from the whole
 matrix (it compares the result from each row to the last and keeps the
 'better' result). I would like to divide that first large matrix into
 chunks equal to the number of cores I have available to me, and work
 through each chunk, then output the results from each chunk.

 I'm really having trouble making head or tail of how to do this on a
 windows machine - lots of different false starts on several different
 packages now. Basically, I have the function, and I can of course easily
 divide the matrix into chunks. I just need a way to process each chunk
 in parallel (other than opening new R sessions for each core manually).

 Any help much appreciated - after two days of trying to get this to work
 I'm pretty burnt out.


 Hi Ben -- in your code from this morning you had a function

 fitting <- function(ndx.grd=two, dt.grd=one, ind.vr='ind', rsp.vr='res') {
 ## ... setup
 for(i in 1:length(ndx.grd[,1])){
 ## ... do work
 }
 ## ... collate results
 }

 that you're trying to run in parallel. Obviously the ## ... represent
 lines I've removed. When you say something like

 y <- foreach(icount(length(two))) %dopar% fitting()

 it's saying that you want to run fitting() length(two) times. So you're
 actually doing the same thing length(two) times, whereas you really want to
 divide the work that's inside fitting() into chunks, and do those on
 separate cores!

 Conceptually what you'd like to do is

 fit_one <- function(idx, ndx.grd, dt.grd, ind.vr, rsp.vr) {
 ## ... do work on row idx _ONLY_
 }

 and then evaluate with

 ## ... setup
 y <-
   foreach(idx = icount(nrow(two))) %dopar% fit_one(idx, two, one, 'ind', 'res')
 ## ... collate

 so that fit_one fits just one of your combinations. foreach will worry
 about distributing the work. Make sure that fit_one works first, before
 trying to run this in parallel; your use of try(), trying to fit different
 data types (character, integer, numeric) into a matrix rather than
 data.frame, and the type coercions all indicate that you're fighting with R
 rather than working with it.
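
To make the chunk-and-reduce structure concrete, here is a sketch using only the parallel package that ships with R (a substitute for the foreach code above, not a rewrite of Ben's function). The scoring rule (smallest sum of squares) and the names best_in_chunk, find_best and find_best_parallel are all hypothetical stand-ins for the real fitting work.

```r
library(parallel)

# Work on one chunk of rows and return the best row found in that chunk.
best_in_chunk <- function(chunk) {
  scores <- rowSums(chunk^2)          # stand-in for the real per-row fit
  i <- which.min(scores)
  list(row = chunk[i, ], score = scores[i])
}

# Split the matrix into chunks, process each, then keep the overall best.
# `apply_fun` is any lapply-like function, so the sequential and parallel
# versions share the same code path.
find_best <- function(m, n_chunks, apply_fun = lapply) {
  idx     <- splitIndices(nrow(m), n_chunks)
  chunks  <- lapply(idx, function(i) m[i, , drop = FALSE])
  results <- apply_fun(chunks, best_in_chunk)
  scores  <- vapply(results, function(r) r$score, numeric(1))
  results[[which.min(scores)]]
}

# Parallel variant: a socket cluster, which also works on Windows.
find_best_parallel <- function(m, n_cores = 2) {
  cl <- makeCluster(n_cores)
  on.exit(stopCluster(cl))
  find_best(m, n_cores, function(x, f) parLapply(cl, x, f))
}

set.seed(1)
m <- matrix(rnorm(5000), ncol = 5)
best_seq <- find_best(m, n_chunks = 4)   # check the sequential version first
```

Once the sequential result looks right, find_best_parallel(m, detectCores()) runs the same chunks on worker processes. Each worker returns only its own best row, so very little data travels back to the master.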

 Hope that helps,

 Martin


 Thanks

 *Ben Caldwell*




 --
 Computational Biology / Fred Hutchinson Cancer Research Center
 1100 Fairview Ave. N.
 PO Box 19024 Seattle, WA 98109

 Location: Arnold Building M1 B861
 Phone: (206) 667-2793




Re: [R] Transferring R to another computer, R_HOME_DIR

2013-04-25 Thread lcn
Well, to my understanding, you planned to rsync the original compiled
folder from one machine to somewhere on another machine, and work with it.
Then how about creating a symbolic link at /usr/lib64/R on the second machine?
Or maybe I misunderstand your purpose?
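
Whichever route is taken (a symbolic link or a relocated tree), it may help to confirm from inside the copied R which home directory it actually resolved. A small check using base R only:

```r
# Directory R believes it is installed in (set by the front-end script).
r_home <- R.home()
r_home

# The same value is normally exported to the session as an environment variable.
Sys.getenv("R_HOME")

# Sanity checks: the directory exists and contains the bin subdirectory.
dir.exists(r_home)
dir.exists(R.home("bin"))
```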

