date:20110623


Folks,

the relevant thing you have to remember is: All the stuff must be valid 
R syntax (with few additional functions as mention in the ?plotmath help 
file). Knowing that it is obvious where additional operators are required.


Best,
Uwe Ligges


On 23.06.2011 02:56, Bryan Hanson wrote:

Thanks to both David and Sarah. I'm glad I asked, as I had tried some of
the combos Sarah suggested and observed the same behavior, which puzzled
me. David, thanks for reminding me about ~ as that is a different way to
get a space into the string. I just don't use plotmath often enough to
become decent at it. Thanks again. Bryan

On Jun 22, 2011, at 8:49 PM, David Winsemius wrote:



On Jun 22, 2011, at 8:10 PM, Bryan Hanson wrote:


Hello R Masters and the Rest of Us:

The first of these works fine, the 2nd is accepted but too literal
(the %-% is shown in the plot label and in the wrong position).
The 3rd throws and error due to unexpected SPECIAL. Would someone
recommend a way to format this? I want the two phrases connected by a
right arrow.

TIA, these things always elude me. Bryan
***
Bryan Hanson
Professor of Chemistry  Biochemistry
DePauw University

xlab1 -expression(paste(Phase Angle , phi,  Neat-O))
xlab2 - expression(paste(treatment: low stress, high stress, sep
= %-%))
xlab3 - expression(paste(treatment: low stress, %-%, high stress))

plot(1:10, main = xlab1)
plot(1:10, main = xlab2)


Doesn't seem that %-% works without flanking terms

xlab3 - expression(treatment*:~low~stress %-% high~stress)
plot(1, main=xlab3)

Or:

xlab3 - expression(treatment: low stress %-% high~stress)
plot(1, main=xlab3)

--
David Winsemius, MD
West Hartford, CT



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Help with winbugs code


1. This list is about R rather than WinBUGS.
2. The attachment did not come through the list anyway.

Best,
Uwe Ligges


On 22.06.2011 09:29, nita yalina wrote:

Good afternoon sir,
i'm a student in Indonesia who study technology management, i need to review
the software i'v been develop, i was told to use bayesian SEM because i
don't have large sample. i don't know much about statistics but i try to
make a code via winbugs.. and a i got a problem. here i attach my winbugs
code and my model. could you hep me to find what's wrong with my code? thank
you very much for your answer



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Problem with mclapply -- losing output/data


Janet,

1. you posted to the R-help mailing list rather than to Elizabeth,
2. if posting to a mailing list, please quote the original thread 
otherwise we do not know what you are talking about if we removed the 
prior part of the thread.


Uwe Ligges



On 23.06.2011 04:22, Janet Young wrote:

Hi Elizabeth,

I just found your thread after experiencing a similar problem (I was also
using some IRanges/GenomicRanges functions).  You've probably figured it
out by this time - I'm actually curious to know what you found?

I think I've tracked it down in my case, where a small minority of the
elements I was trying to lapply/mclapply over caused a genuine error
(perhaps in your case a genuine NULL answer - is that possible?).  When
I used lapply that error was very obvious - it came up on the screen.
But when I used mclapply it looked like the function completed fine -
it wasn't until later that I saw the error was captured as a string
object in the relevant slot of  the list object returned by mclapply.
I should build in some checking if I want to persist with using
mclapply.

I wonder if something similar is going on in your case?

Janet

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Saved EPS does not match screen when using bquote(.(i))




On 23.06.2011 07:14, Dennis Murphy wrote:

Hi:

As Uwe suggested...

pdf('testgraph.pdf')
layout( matrix( 1:2 , nrow=2 ) )
for ( i in 1:2 ) {
   plot( 0 , 0 , xlab=bquote(mu[.(i)]) )
  }
dev.off()

postscript('testgraph.ps')
layout( matrix( 1:2 , nrow=2 ) )
for ( i in 1:2 ) {
   plot( 0 , 0 , xlab=bquote(mu[.(i)]) )
  }
dev.off()

png('testgraph.png')
layout( matrix( 1:2 , nrow=2 ) )
for ( i in 1:2 ) {
   plot( 0 , 0 , xlab=bquote(mu[.(i)]) )
  }
dev.off()

The three graphs look the same (although the PS graph is rotated to
landscape while the other two are portrait).


Yes, thanks, and ?postscript will show arguments such as 'horizontal' 
and 'paper' that will help to make them really similar.


Best,
Uwe


 The main point is that

mu_1 and mu_2 show up correctly in the two panels in all three graphs
(at least on my viewers).

The following thread from last January describes some of the problems
that certain viewers have with Greek letters, which appear to be
viewer and platform dependent:

http://r-project.markmail.org/search/?q=pdf%20incorrect#query:pdf%20incorrect+page:2+mid:egmb6utulrxgcznw+state:results

I'm guessing that I've seen about a half dozen or so similar posts in
this forum over the past year and a half, so you can check the list
archives for related problems.

HTH,
Dennis

On Wed, Jun 22, 2011 at 8:07 PM, John Kruschkejohnkrusc...@gmail.com  wrote:

Here's a fairly minimal-case example in which the saved EPS does not match
the screen. The error comes when using bquote(.(i)) instead of bquote(1), as
demonstrated by the two minimally different cases below. Very strange. Any
clues as to why?

# begin ---

# Version A. X axis labels have subscripts as constants. EPS is correct.
windows()
layout( matrix( 1:2 , nrow=2 ) )
plot( 0 , 0 , xlab=bquote(mu[1]) )
plot( 0 , 0 , xlab=bquote(mu[2]) )
savePlot( file=SavePlotTestA.eps , type=eps ) # Axis labels are correct
in EPS.

# Version B. X axis labels have subscripts as variable index. EPS is wrong!
windows()
layout( matrix( 1:2 , nrow=2 ) )
for ( i in 1:2 ) {
  plot( 0 , 0 , xlab=bquote(mu[.(i)]) )
}
savePlot( file=SavePlotTestB.eps , type=eps ) # X-AXIS OF PLOT 1 IS
WRONG IN EPS.

#-- end -

Thanks!

John K. Kruschke, Professor
http://www.indiana.edu/%7Ekruschke/DoingBayesianDataAnalysis/


2011/6/22 Uwe Liggeslig...@statistik.tu-dortmund.de




On 22.06.2011 13:50, John Kruschke wrote:


The error happens when using the savePlot() command, like this:
savePlot( file=TestSavePlot.eps , type=eps )
savePlot( file=TestSavePlot.jpg , type=jpg )



Well, plot directly into a device, for postscript:

postscript(estSavePlot.eps, additionalArguments .)
plot(1:10)
dev.off()

Uwe Ligges


  The images in the two saved files are not the same, with the JPG being

correct but the EPS being wrong.

When you suggest starting separate devices explicitly, what do you mean?
(I've skimmed through the results of ??device, but can't make sense of
it.)
Thank you!

John K. Kruschke, Professor


2011/6/22 Uwe 
Liggeslig...@statistik.tu-**dortmund.delig...@statistik.tu-dortmund.de




  I guess you use the menu to save the plots from your Windows device into

files rather than starting separate devices explicitly?
If so, please use explicit calls to the devices and everything happens as
you define it.

Uwe Ligges




On 22.06.2011 04:31, John Kruschke wrote:

  When I save a particular plot as JPG or BMP, the saved image is an

accurate
copy of that plot.
But when I save the same plot as EPS or PDF, some text elements are
incorrectly altered.
An example is attached. Notice in the top middle and top right panels,
the
x-axis labels have correct subscript 1 in the JPG, but incorrect
subscript
2
in the EPS.

I'm using R 2.13.0 on Windows 7.

Any clues regarding the source of this error and its solution would be
appreciated. For example, are there EPS/PDF device drivers that need to
be
separately updated?

Many thanks.

John K. Kruschke, Professor
http://www.indiana.edu/%7Ekruschke/DoingBayesianDataAnalysis/
htt**p://www.indiana.edu/%**7Ekruschke/**DoingBayesianDataAnalysis/http://www.indiana.edu/%7Ekruschke/DoingBayesianDataAnalysis/










__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-helphttps://stat.ethz.ch/mailman/**listinfo/r-help
https://stat.**ethz.ch/mailman/listinfo/r-**helphttps://stat.ethz.ch/mailman/listinfo/r-help




PLEASE do read the posting guide http://www.R-project.org/**
posting-guide.htmlhttp://www.**R-project.org/posting-guide.**htmlhttp://www.R-project.org/posting-guide.html




and provide commented, minimal, self-contained, reproducible code.








[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide

Re: [R] please help for mgcv package

2011-06-23 Thread Simon Wood

In the 'inst/scripts' directory of package 'gamair' there are files 
containing the R code for each chapter of the book (questionable layout 
uncorrected, but typo free, at least!).


Simon


On 21/06/11 16:27, pigpigmeow wrote:

i read a book from WOOD, there's an example which is talking about the
pollutant.

library(gamair)
library(mgcv)
y-gam(death~s(time,bs=cr,k=200)+s(pm10median,bs=cr)+s(so2median,bs=cr)+s(o3median,bs=cr)+s(tmpd,bs=cr),data=chicago,family=Possion)

lag.sum-function(a,10,11)
{n-length(a)
b-rep(0,n-11)
for(i in 0:(11-10))
b-b+a[(i+1):(n-11+i)]

b}

death-chicago$death[4:5114]
time-chicago$time[4:5114]
o3-lag.sum(chicago$o3median,0,3)
tmp-lag.sum(chicago$tmpd,0,3)
pm10-lag.sum(log(chicago$pm10median+40),0,3)
so2-lag.sum(log(chicago$so2median+10),0,)

I don't know what is the script (Bold font ) used for..

and it shows Error: unexpected numeric constant in lag.sum-function(a,10
, why?

anyone can answer me?

--
View this message in context: 
http://r.789695.n4.nabble.com/please-help-for-mgcv-package-tp3614485p3614485.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




--
Simon Wood, Mathematical Science, University of Bath BA2 7AY UK
+44 (0)1225 386603   http://people.bath.ac.uk/sw283

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Subtracting TimeStamps

2011-06-23 Thread sumit gupta

Hi All,,

I am new to R and having a problem dealing with timestamps.

I have 2 columns in my table . Both of these have timestamps value in
format  06/22/11 05:34 PM . I want to calculate the difference in these 2
columns ( in minutes) and save it as a 3rd column.

Can anyone help ?

Thanks
Sumit

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Removing rows of zeros from a matrix



On Jun 2, 2011, at 11:35 AM, Petr Savicky wrote:


On Thu, Jun 02, 2011 at 11:23:28AM -0400, Jim Silverton wrote:

Hi,
Can someone tell me how to remove rows of zeros from a matrix?
For example if I have the following matrix,

0 0
0 1
2 8
0 0
4 56

I should end up with
0 1
2 8
4 56


Hi.

Try the following

 a - matrix(c(0, 0, 2, 0, 4, 0, 1, 8, 0, 56), ncol=2)
 a[rowSums(a != 0) != 0, ]


To avoid removing rows where non-zero elements do sum to 0 one could  
use the only slightly longer test that first converts a to logical:


 a - matrix(c(1, 0, 2, 0, 4, -1, 1, 8, 0, 56), ncol=2)

 a[ rowSums(a==0) != ncol(a), ]
 [,1] [,2]
[1,]1   -1
[2,]01
[3,]28
[4,]4   56

--
David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] MST dissimilarity

2011-06-23 Thread Ondřej Mikula

Dear R-helpers,
I need to quantify dissimilarity of two minimum spanning trees,
specifically dissimilarity of their topologies. (They connect the same
objects but they are calculated from different sets of variables.)
Are you aware of any R-function doing this?
Best regards
Ondrej Mikula

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Subtracting TimeStamps



On Jun 23, 2011, at 7:26 AM, sumit gupta wrote:


Hi All,,

I am new to R and having a problem dealing with timestamps.

I have 2 columns in my table . Both of these have timestamps value in
format  06/22/11 05:34 PM .


If these are Date or DateTime objects you can use:

?difftime   # with attention to the units argument
?DateTimeClasses

If they are not yet in a correct class you need to first use:

?strptime


I want to calculate the difference in these 2
columns ( in minutes) and save it as a 3rd column.


When examples are provided 

--
David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Time-series analysis with treatment effects - statistical approach

2011-06-23 Thread Mike Marchywka










 From: rvarad...@jhmi.edu
 To: marchy...@hotmail.com; jmo...@student.canterbury.ac.nz; 
 r-help@r-project.org
 Subject: RE: [R] Time-series analysis with treatment effects - statistical 
 approach
 Date: Thu, 23 Jun 2011 02:59:19 +
 
 If you have any specific features of the time series of soil moisture, you 
 could either model that or directly estimate it and test for differences in 
 the 4 treatments.  If you do not have any such specific considerations, you 
 might want to consider some nonparametric approaches such as functional data 
 analysis, in particular  functional principal components analysis (fPCA) 
 might be relevant.  You could also consider semiparametric methods. For 
 example, take a look at the SemiPar package.  
 
 Ravi.

I guess just playing with it while waiting for other code to finish, I'd be 
curious if you had
any controlled tests such as impulse response- what did treatment do when you 
held
at constant temp and humidity and illumination in stll air after single burst 
of rain? 
If you were pursing the model approach, quick look suggests qualitatitve rather 
than
just quantitative effects - in one case looks like linear or biphasic dry out 
dynamics, others
seem to just fall off of cliff. 

Objective of course matters too, if you are trying to sell this to farmers, 
maybe a plot of
moisture for each treatment against control would help. I just did that after 
averaging over sensors
and it may be a reasonable analysis for cost effectiveness if you can translate 
moisture into
dollars. Now you would still need to put error bars on comparisons and use 
words carefully etc
but that approach may be more important than getting at dynamics. I dunno.
Consider that in fact maybe all you care about is peaks, if too dry for one day
kills the crop then that is what you want to focus the analysis on etc etc etc.






 
 From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] on behalf 
 of Mike Marchywka [marchy...@hotmail.com]
 Sent: Wednesday, June 22, 2011 9:31 PM
 To: jmo...@student.canterbury.ac.nz; r-help@r-project.org
 Subject: Re: [R] Time-series analysis with treatment effects - statistical 
 approach
 
  Date: Wed, 22 Jun 2011 17:21:52 -0700
  From: jmo...@student.canterbury.ac.nz
  To: r-help@r-project.org
  Subject: Re: [R] Time-series analysis with treatment effects - statistical 
  approach
 
  Hi Mike, here's a sample of my data so that you get an idea what I'm working
  with.
 
 Thanks, data helps make statements easier to test :)  I'm quite
 busy at moment but I will try to look during dead time.
 
 
  http://r.789695.n4.nabble.com/file/n3618615/SampleDataSet.txt
  SampleDataSet.txt
 
  Also, I've uploaded an image showing a sample graph of daily soil moisture
  by treatment. The legend shows IP, IP+, PP, PP+ which are the 4 treatments.
  Also, I've included precipitation to show the soil moisture response to
  precip.
 
 Personally I'd try to write a simple physical model or two and see which 
 one(s)
 fit best. It shouldn't be too hard to find sources and sinks of water and 
 write
 a differential equation with a few parameters.  There are probably online
 lecture notes that cover this or related examples. You probably suspect a
 mode of action for the treatments, see if that is consistent with observed 
 dyanmics.
 You may need to go get temperature and cloud data but it may or may not
 be worth it.
 
 
  http://r.789695.n4.nabble.com/file/n3618615/MeanWaterPrecipColour2ndSeasonOnly.jpeg
 
  I have used ANOVA previously, but I don't like it for 2 reasons. The first
  is that I have to average away all of the interesting variation. But mainly,
 
 There are  a number of assumptions that go into that to make it useful. If
 you are just drawing samples from populations of identical independent things
 great but here I would look at things related to non-stationary statistics of
 time series.
 
  it becomes quite cumbersome to do a separate ANOVA for each day (700+ days)
  or even each week (104 weeks).
 
 I discovered a way to do repetitive tasks that can be concisely specified 
 using
 something called a computer.  Writing loops is pretty easy, don't give up
 due to cumbersomeness. Also, you could try a few simple things like plotting
 difference charts ( plot treatment minus control for example).
 
 If you approach this purely empirically, there are time series packages
 and maybe the econ/quant financial analysts would have some thoughts
 that wouldn't be well known in your field.
 
 
 
  Thanks for your help,
  -Justin
 


- - - - - -
 
Mike Marchywka | V.P. Technology
 
415-264-8477
marchy...@phluant.com
 
Online Advertising and Analytics for Mobile
http://www.phluant.com
 
 




  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read

Re: [R] numerical integration and 'non-finite function value' error

2011-06-23 Thread Adan_Seb

Here is a self-contained example of my problem. 

set.seed(100)
x = rbeta(100, 10.654, 10.439)
# So the shape parameters and the exteremes are
a = 10.654
b = 10.439
xmax = 1
xmin = 0
# Using the non-standardized form (as in my application and this shouldn't
make any difference) of the 
# Beta density function, I specify the  integrand (i.e., xf(x)) as
integrand = function(x) {x*((1/beta(a,b))*((x^(a - 1)*(xmax - x)^(b-1)) /
xmax^(a + b - 1)))}
 
# Say I want to integrate in the range (0, 0.45) and then in (0.45, Inf)
# In (0, 0.45)
 
integrate(integrand, lower = 0, upper = 0.45)$val
[1] 0.1176079

# In (0.45, Inf)
 
integrate(integrand, lower = 0.45, upper = Inf)$val
Error in integrate(integrand, lower = 0.45, upper = Inf) : 
  non-finite function value

The same thing happens when I integrate f(x) only.
Thanks again.


--
View this message in context: 
http://r.789695.n4.nabble.com/numerical-integration-and-non-finite-function-value-error-tp3618486p3619761.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] AIC() vs. mle.aic() vs. step()?

The packages is wle.

I'll put together some code that shows the behavior I'm talking about,
and send it to the list.

Alexandra

On Thu, 2011-06-23 at 13:51 +0200, Rubén Roa wrote: 
 I don't find the mle.aic function. Thus it does not ship with R and it's in 
 some contributed package.
 What package is that?
 If you had asked for help providing minimal, self-contained, reproducible 
 code, you'd have realized that you need to tell people what package you are 
 using.
 
 ___
  
 
 Dr. Rubén Roa-Ureta
 AZTI - Tecnalia / Marine Research Unit
 Txatxarramendi Ugartea z/g
 48395 Sukarrieta (Bizkaia)
 SPAIN
 
  
 
  -Mensaje original-
  De: r-help-boun...@r-project.org 
  [mailto:r-help-boun...@r-project.org] En nombre de Alexandra Thorn
  Enviado el: miércoles, 22 de junio de 2011 22:38
  Para: r-help@r-project.org
  Asunto: [R] AIC() vs. mle.aic() vs. step()?
  
  I know this a newbie question, but I've only just started 
  using AIC for model comparison and after a bunch of different 
  keyword searches I've failed to find a page laying out what 
  the differences are between the AIC scores assigned by AIC() 
  and mle.aic() using default settings.  
  
  I started by using mle.aic() to find the best submodels, but 
  then I wanted to also be able to make comparisons with a 
  couple of submodels that were nowhere near the top, so I 
  started calculating AIC values using AIC().  What I found was 
  that not only the scores, but also the ranking of the models 
  was different.  I'm not sure if this has to do with the fact 
  that mle.aic() scores are based on the full model, or some 
  sort of difference in penalties, or something else.
  
  Could anybody enlighten me as to the differences between 
  these functions, or how I can use the same scoring system to 
  find the best models and also compare to far inferior models?
  
  Failing that, could someone point me to an appropriate 
  resource that might help me understand?
  
  Thanks in advance,
  Alexandra
  
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide 
  http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
  

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Rms package - problems with fit.mult.impute

2011-06-23 Thread Lina Hellström

Hi!
Does anyone know how to do the test for goodness of fit of a logistic model (in 
rms package) after running fit.mult.impute?

I am using the rms and Hmisc packages to do a multiple imputation followed by a 
logistic regression model using lrm.
Everything works fine until I try to run the test for goodness of fit: 
residuals(type=c(gof))
One needs to specify y=T and x=T in the fit. But I get a warning message when I 
do that with fit.multiple.impute.

a-aregImpute(~med.hist.err+ 
med.discr+newLiving+No.drugs+Days.categ+Los+Age+Ward+Sex, n.impute=20, 
nk=0,data=med.err)
ddist-datadist(Age,No.drugs,Days.categ, Sex, Living, Ward)
options(datadist=ddist)

fmi-fit.mult.impute(med.hist.err~Age+No.drugs+Days.categ+Sex+Living+Ward, 
fitter=lrm, x=T, y=T,a,data=med.err)
Error in 1:n.impute : NA/NaN argument
In addition: Warning message:
In 1:n.impute : numerical expression has 18 elements: only the first used

It works to do the fit.mult.impute without x and y=T but then I get the 
following warning message when running residuals
gof-residuals(fmi, type=c(gof))
Error in residuals.lrm(fmi, type = c(gof)) :
  you did not specify y=T in the fit

It was no problem to do the goodness of fit test when I ran the lrm on my 
complete data set without multiple imputation and fit.mult.impute.
model.lrm-lrm(med.hist.err~Age+No.drugs+Days +Sex+Living+Ward, x=TRUE, y=TRUE)
gof-residuals(model.lrm, type=c(gof))

Thanks
Lina
_
PhD student
Linnaeus University
Sweden




[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] The R Journal Vol. 3/1 now published

2011-06-23 Thread Heather Turner

Dear All,

The first issue of the third volume of The R Journal is now available at
http://journal.r-project.org/current.html. 

Thanks to everyone involved.

Heather

--
Editor in chief
heather.tur...@r-project.org

___
r-annou...@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-announce

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] numerical integration and 'non-finite function value' error



On Jun 23, 2011, at 8:55 AM, Adan_Seb wrote:


Here is a self-contained example of my problem.

set.seed(100)
x = rbeta(100, 10.654, 10.439)
# So the shape parameters and the exteremes are
a = 10.654
b = 10.439
xmax = 1
xmin = 0
# Using the non-standardized form (as in my application and this  
shouldn't

make any difference) of the
# Beta density function, I specify the  integrand (i.e., xf(x)) as
integrand = function(x) {x*((1/beta(a,b))*((x^(a - 1)*(xmax -  
x)^(b-1)) /

xmax^(a + b - 1)))}

# Say I want to integrate in the range (0, 0.45) and then in (0.45,  
Inf)

# In (0, 0.45)

integrate(integrand, lower = 0, upper = 0.45)$val
[1] 0.1176079

# In (0.45, Inf)


Right. integrand() is NaN outside [0,1]

 integrand(1:100)
  [1]   0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN  
NaN NaN NaN NaN NaN
 [22] NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN  
NaN NaN NaN NaN NaN
 [43] NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN  
NaN NaN NaN NaN NaN
 [64] NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN  
NaN NaN NaN NaN NaN

 [85] NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN


integrate(integrand, lower = 0.45, upper = Inf)$val
Error in integrate(integrand, lower = 0.45, upper = Inf) :
 non-finite function value

The same thing happens when I integrate f(x) only.
Thanks again.


--
View this message in context: 
http://r.789695.n4.nabble.com/numerical-integration-and-non-finite-function-value-error-tp3618486p3619761.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] [example code] RE: AIC() vs. mle.aic() vs. step()?

Ok, here's some example code showing how I get different output for AIC
vs. mle.aic().  Now that I've taken another look at the independent
variables, I'm wondering whether missing values in one of the variables
might be what is messing me up.  I'm going to see if the behavior
changes when I remove those...

Alexandra

#Code with outputs
R require(wle)
R xA # 1st independent variable (categorical)
  [1] Diffuse Diffuse Diffuse Diffuse Diffuse Diffuse Diffuse Diffuse
Diffuse
[10] Diffuse Diffuse RingDiffuse Diffuse RingDiffuse Diffuse
Diffuse
[19] Diffuse Diffuse RingDiffuse Diffuse Diffuse Diffuse Diffuse
Diffuse
[28] Diffuse Diffuse RingRingDiffuse Diffuse RingRing
Ring   
[37] Diffuse Diffuse RingRingRingRingRingRing
Diffuse
[46] Other   RingRingRingRingRingOther   Ring
Ring   
[55] RingRingRingRingRingRingRingDiffuse
Diffuse
[64] Diffuse RingRingRingDiffuse Diffuse Diffuse Diffuse
Diffuse
[73] Diffuse Diffuse Diffuse Other   RingRingRingRing
Diffuse
[82] Diffuse Diffuse Diffuse RingRingRingRingRing
Diffuse
[91] Other   Other   RingRingRingOther   RingOther
Diffuse
[100] Diffuse Diffuse RingRing   
Levels: Diffuse Other Ring
R x5 # 2nd independent variable
  [1]  35.1890163  22.8565556  15.2969944   9.6002241  25.0393843
21.1797882
  [7]   9.2677660  14.5228280   6.6982274   5.7889657  21.4854297
20.5942436
[13]  20.2180106   0.4442017   5.0414991  26.9849474  14.7613970
10.3045834
[19]  13.4192478  13.9074085   6.7219989  13.2569404  18.1492698
8.9814628
[25]  14.2575003  21.8982503   8.5661574  15.3434996   7.4060632
10.2824613
[31]  23.4777018  35.3389594  51.5448186   6.9571801  23.3166747
35.2280399
[37]  53.3812646  44.7933630  25.5658796   9.6980968   2.9003139
4.8073814
[43]   6.9274067   8.6178642  43.9578503   0.000  44.1995269
14.6878355
[49]   5.6385462   0.000  21.1687124  20.5669418   0.000
0.000
[55]  28.4924849   8.7184163  18.8744437  20.9748315  21.3849539
163.1436925
[61]  10.8565582   9.9297861   0.000   0.000  41.9369100
121.7625948
[67]  13.5709398  20.1040412  14.1449650   8.2172524  10.1649988
19.5981176
[73]  20.3028117  17.0104638  12.6129991   8.2051932   6.4293587
22.1598564
[79]  13.9703385  23.0206302  15.2590230  14.4778824   2.4819054
21.8293460
[85]  25.1515167  32.1050850  12.5154914  11.6927538   9.4048632
38.4559899
[91]  53.1959167  14.4917170  10.2548528   8.8227194  12.8573515
10.0589965
[97]  12.8868929   9.6626724   5.9826061   3.2581190  13.4467376
8.8065840
[103]  17.7734493
 x15 # 3rd independent variable
  [1]  1.69924629 -1.63414400  0.71415169  4.17480342  1.52512663
1.73541068
  [7] -5.47498002  0.95681283 -1.48092555  1.51101949 -2.25838766
2.12958863
[13]  1.43795703 -4.48003373 -3.65963009 -0.76346388 -2.44019863
1.32552847
[19]  1.89863804  1.80655970 -0.74175682  1.30112633 -1.06424643
-1.47852202
[25]  0.09035915  NA  NA  1.82385292 -0.15308708
1.04685322
[31]  2.45599032  1.36474093 -2.39863477 -0.21220447 -2.50255033
-1.92296430
[37] -0.24577578 -1.96756216  0.43349997  0.88459859 -0.12755905
2.31771322
[43] -1.21846731  1.75082992 -3.02346893 -4.15582445  1.09946460
4.30008522
[49]  4.37542383  NA -1.93641862 -0.01919492 -2.39609318
-3.12228102
[55]  0.48804606 -1.42886437 -3.52078266  3.22115286  0.87942540
-0.29385365
[61]  0.40030867  0.84382607 -0.14445408 -0.61903527  NA
1.53158894
[67] -1.01595045  0.18857375 -1.24703875 -0.53766035 -0.43305094
1.30035414
[73]  0.08256647 -0.01008154 -1.89151834  0.60161181  1.38339048
1.70782208
[79]  0.48995599  NA  0.71774340  NA  0.35578308
-1.30038021
[85]  0.18170942 -0.76999772 -0.52860127 -0.58713905  2.45770818
-3.79345760
[91] -0.73700348  1.85916858  0.48523489 -2.24404921 -3.71691741
-0.80525820
[97]  0.20768561 -0.05588210  NA -0.50332833  0.70407465
-0.57391160
[103] -1.11740646
 y1 #response variable
  [1] 0.11736407 0.12793015 0.06627390 0.03385292 0.05111586 0.12896867
  [7] 0.21030113 0.10661115 0.02321079 0.06035170 0.17966075 0.22120809
[13] 0.16367033 0.07062699 0.11563063 0.62809888 0.13571557 0.14366535
[19] 0.16453117 0.04030618 0.29904079 0.13865458 0.25814464 0.09636693
[25] 0.14262893 0.12619897 0.15919200 0.10713175 0.18137740 0.37961763
[31] 0.16831734 0.02425770 0.12793015 0.23174790 0.16384251 0.41976893
[37] 0.12498691 0.18960957 0.33873792 0.19594614 0.44510411 0.45554491
[43] 0.70821663 0.20739951 0.07828510 0.07393444 0.12290867 0.22614130
[49] 0.49742825 0.04013179 0.58127117 0.05216166 0.27597288 0.14090123
[55] 0.22120809 0.49090375 0.33216113 0.03437621 0.12031011 0.10261893
[61] 0.58141318 0.06244214 0.03594604 0.17966075 0.09345085 0.43887815
[67] 0.51929244 0.20501885 0.04663966 0.33104604 0.28841287 0.26924687
[73] 0.29495726 0.23675230 0.33385065 0.02814909 0.25281753 0.21240608
[79] 0.15204576 0.18288671 0.32867804 0.19813360 0.17379109

[R] Confidence interval from resampling

2011-06-23 Thread Adriana Bejarano

Dear R gurus,

I have the following code, but I still not know how to estimate and extract
confidence intervals (95%CI) from resampling.

Thanks!

~Adriana

#data
penta-c(770,729,640,486,450,410,400,340,306,283,278,260,253,242,240,229,201,198,190,186,180,170,168,151,150,148,147,125,117,110,107,104,85,83,80,74,70,66,54,46,45,43,40,38,10)
x-log(penta+1)
plot(ecdf(x), ylab=Probability, xlab=Concentration (Ln+1))

x.wei-fitdistr(x,weibull)
x.wei
 shapescale
  6.7291685   5.3769965
 (0.7807718) (0.1254696)
xwei.shape - x.wei$estimate[[1]]
xwei.scale -  x.wei$estimate[[2]]

x.wei-fitdistr(x,weibull)
x.wei
xwei.shape - x.wei$estimate[[1]]
xwei.scale -  x.wei$estimate[[2]]
curve(pweibull(x, shape=xwei.shape, scale = xwei.scale,lower.tail=TRUE,
log.p=FALSE), add=TRUE,col='green',lwd=3)

#draw random numbers from a weibull distribution 100 times with
shape=xwei.shape, scale = xwei.scale
draw - lapply(1:100, function(.x){
out-rweibull(x, shape=xwei.shape, scale = xwei.scale)
})
newx- data.frame(draw)

colnames(newx)-paste(x, 1:100, sep = )
newmat-data.matrix(newx)

# matrix of coefficients
rownum=2
colnum=100
ResultMat-matrix(NA, ncol=colnum, nrow=rownum)
rownum2=45
colnum2=100
ResultMat2-matrix(NA, ncol=colnum2, nrow=rownum2)

#loop through each column in the source matrix
for (i in 1:100)
{
sel_col-newmat[col(newmat)==i]
  {ResultMat[,i]-coef(fitdistr(sel_col,weibull))}
 xwei.shape- ResultMat[1,i]
   xwei.scale- ResultMat[2,i]
 curve(pweibull(x, shape=xwei.shape, scale=xwei.scale, lower.tail=TRUE,
log.p = FALSE), add=TRUE, col='grey',lwd=0.5)
 ResultMat2[,i]-pweibull(x, shape=xwei.shape, scale =
xwei.scale,lower.tail=TRUE, log.p=FALSE)
}

## convert dataframe to numeric
MatOut- as.matrix(ResultMat2)
mode(MatOut) - numeric

# initiate variables
mm-ml-mu-rep(0,length(MatOut[,1]))

# mean and upper/lower quantiles
for(i in 1:length(MatOut[,1])){
 mm[i]- mean(MatOut[i,])
 ml[i]- quantile(MatOut[i,], 0.025, na.rm=TRUE)
 mu[i]- quantile(MatOut[i,], 0.975, na.rm=TRUE)
}
#lines(x, mm, col=black)
lines(x, ml, col=blue, lwd=2)
lines(x, mu, col=blue, lwd=2)

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] [cleaned code] RE: AIC() vs. mle.aic() vs. step()?


On Thu, 2011-06-23 at 09:29 -0400, Alexandra Thorn wrote:
 Ok, here's some example code showing how I get different output for
AIC
 vs. mle.aic().  Now that I've taken another look at the independent
 variables, I'm wondering whether missing values in one of the
variables
 might be what is messing me up.  I'm going to see if the behavior
 changes when I remove those...
 

Okay, here's the code with dput() used to present the various objects.
Thanks to the list for being so patient with me... it's been quite
educational.

Thanks in advance,
Alexandra

Code:
R require(wle)
R dput(xA)
structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 3L, 1L, 
1L, 3L, 1L, 1L, 1L, 1L, 1L, 3L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
3L, 3L, 1L, 1L, 3L, 3L, 3L, 1L, 1L, 3L, 3L, 3L, 3L, 3L, 3L, 1L, 
2L, 3L, 3L, 3L, 3L, 3L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
1L, 1L, 1L, 3L, 3L, 3L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 3L, 
3L, 3L, 3L, 1L, 1L, 1L, 1L, 3L, 3L, 3L, 3L, 3L, 1L, 2L, 2L, 3L, 
3L, 3L, 2L, 3L, 2L, 1L, 1L, 1L, 3L, 3L), .Label = c(Diffuse, 
Other, Ring), class = factor)
R dput(x5)
c(35.1890162754395, 22.8565556431035, 15.296994438186,
9.60022407772812, 
25.0393843292946, 21.179788208, 9.26776603463063, 14.5228279661883, 
6.6982273554755, 5.7889657235105, 21.4854296564891, 20.5942435860781, 
20.2180106449331, 0.44420165807875, 5.041499147412, 26.984947359545, 
14.7613969969327, 10.304583446995, 13.4192477726851, 13.90740846636, 
6.721998863216, 13.25694036483, 18.1492698335532, 8.9814627576195, 
14.2575003028425, 21.8982502817969, 8.5661573887, 15.343499557995, 
7.4060631990625, 10.2824613451941, 23.4777018427811, 35.3389594363836, 
51.5448185920973, 6.9571800684925, 23.3166747093435, 35.2280399322705, 
53.3812645912466, 44.7933630466069, 25.5658796310335, 9.6980968165235, 
2.90031387090862, 4.80738140821225, 6.927406749722, 8.61786424398488, 
43.957850260725, 0, 44.1995269203482, 14.68783550262, 5.63854620095413, 
0, 21.1687123966326, 20.566941833529, 0, 0, 28.4924849319605, 
8.7184162712155, 18.8744437360889, 20.9748315239075, 21.3849539280062, 
163.143692522173, 10.85655822755, 9.92978608605625, 0, 0,
41.9369100379775, 
121.762594814280, 13.570939755438, 20.1040411710892, 14.1449650049318, 
8.2172523975435, 10.16499876975, 19.598117628078, 20.3028116584013, 
17.0104638219038, 12.612999143628, 8.20519315482388, 6.42935872078125, 
22.1598563909594, 13.9703385210014, 23.0206302023242, 15.25902295115, 
14.4778823661717, 2.4819054257875, 21.8293459510672, 25.151516683063, 
32.105084991422, 12.5154914474453, 11.6927538156488, 9.40486317871687, 
38.4559898615062, 53.195916748074, 14.4917169976215, 10.2548528385015, 
8.82271943808338, 12.8573514676201, 10.0589964580665, 12.886892914765, 
9.6626724052155, 5.98260608673, 3.25811900139, 13.446737566015, 
8.80658397675, 17.773449287436)
R dput(x15)
c(1.69924629406401, -1.63414400065288, 0.714151689343318,
4.17480342154949, 
1.52512663197893, 1.73541067946363, -5.47498002151169,
0.956812825760668, 
-1.48092554972038, 1.51101949018443, -2.25838766176389,
2.12958862888441, 
1.43795702627435, -4.48003372542488, -3.65963008576897,
-0.763463882139697, 
-2.44019862561235, 1.32552846648453, 1.89863804289907,
1.80655970149808, 
-0.741756823200407, 1.30112633095768, -1.06424642846912,
-1.47852202054490, 
0.090359152072348, NA, NA, 1.82385291704612, -0.153087078076393, 
1.0468532207338, 2.45599032439301, 1.36474092834838, -2.39863477181754, 
-0.212204468662908, -2.50255033079852, -1.92296430369566,
-0.245775784395867, 
-1.96756216156693, 0.433499968438238, 0.884598593578297,
-0.127559050278120, 
2.31771322353091, -1.21846730709075, 1.7508299240518,
-3.02346893141966, 
-4.15582444612729, 1.09946459784029, 4.30008521664531,
4.37542383384967, 
NA, -1.93641861765076, -0.019194921394532, -2.39609317657158, 
-3.12228102462318, 0.488046064498046, -1.42886436846636,
-3.52078266098328, 
3.22115286286252, 0.879425403143162, -0.293853650273392,
0.400308672754849, 
0.843826073923569, -0.144454076182464, -0.619035270434771, NA, 
1.53158893613932, -1.01595045420127, 0.188573746980020,
-1.24703875463314, 
-0.53766035430668, -0.433050941330375, 1.30035413662748,
0.0825664730349873, 
-0.0100815443036547, -1.89151834308193, 0.601611806130933,
1.38339048228375, 
1.70782208107344, 0.489955991643127, NA, 0.717743402714073, NA, 
0.355783083720979, -1.30038021268004, 0.181709422709264,
-0.769997723552683, 
-0.528601269320360, -0.587139047162164, 2.45770817832288,
-3.79345760049497, 
-0.737003476707607, 1.85916858045961, 0.485234889001515,
-2.24404921428853, 
-3.71691740913278, -0.805258199659559, 0.207685613867357,
-0.0558821002122282, 
NA, -0.503328331764907, 0.704074652205563, -0.573911596976014, 
-1.11740646296423)
R dput(y1)
c(0.117364072525805, 0.127930151301644, 0.066273900401,
0.0338529181312498, 
0.0511158613502366, 0.128968673883822, 0.210301133239691,
0.10661115427526, 
0.0232107944450872, 0.0603516951698553, 0.179660748593193,
0.221208092790247, 
0.163670330934813, 0.0706269859311667,

Re: [R] Confidence interval from resampling



On Jun 23, 2011, at 9:44 AM, Adriana Bejarano wrote:


Dear R gurus,

I have the following code, but I still not know how to estimate and  
extract

confidence intervals (95%CI) from resampling.



If you have a distribution of values, say resamp.stat, of a  
statistic from a properly performed resampling operation you can  
extract and display easily the 5th and 95th percentiles.


CI.stat - quantile(resamp.stat, c(0.05, 0.95) )
CI.stat

Note: I do not think that 100 replications would generally be  
sufficient for final work, although its probably acceptable for testing.


Your code as posted initially threw a bunch of errors since you did  
not include library(MASS), but fixing that fairly obvious problem  
shows that you can draw a nice plot. However, it remains unclear what  
statistic of what distribution you desire to assess. Mean, median, ...  
of what?


I do not think the error that arose on my machine from the wrapped  
text here:


#draw random numbers from a weibull distribution 100 times with
 ... shape=xwei.shape, scale = xwei.scale   - error

. was causing any problem, but there were a bunch of warnings that  
ought to be investigated:


Right after the loop I see ten of these:
Warning messages:
1: In dweibull(x, shape, scale, log) : NaNs produced

--
David


Thanks!

~Adriana

#data
penta- 
c 
(770,729,640,486,450,410,400,340,306,283,278,260,253,242,240,229,201,198,190,186,180,170,168,151,150,148,147,125,117,110,107,104,85,83,80,74,70,66,54,46,45,43,40,38,10 
)

x-log(penta+1)
plot(ecdf(x), ylab=Probability, xlab=Concentration (Ln+1))

x.wei-fitdistr(x,weibull)
x.wei
shapescale
 6.7291685   5.3769965
(0.7807718) (0.1254696)
xwei.shape - x.wei$estimate[[1]]
xwei.scale -  x.wei$estimate[[2]]

x.wei-fitdistr(x,weibull)
x.wei
xwei.shape - x.wei$estimate[[1]]
xwei.scale -  x.wei$estimate[[2]]
curve(pweibull(x, shape=xwei.shape, scale =  
xwei.scale,lower.tail=TRUE,

log.p=FALSE), add=TRUE,col='green',lwd=3)

#draw random numbers from a weibull distribution 100 times with
shape=xwei.shape, scale = xwei.scale
draw - lapply(1:100, function(.x){
out-rweibull(x, shape=xwei.shape, scale = xwei.scale)
})
newx- data.frame(draw)

colnames(newx)-paste(x, 1:100, sep = )
newmat-data.matrix(newx)

# matrix of coefficients
rownum=2
colnum=100
ResultMat-matrix(NA, ncol=colnum, nrow=rownum)
rownum2=45
colnum2=100
ResultMat2-matrix(NA, ncol=colnum2, nrow=rownum2)

#loop through each column in the source matrix
for (i in 1:100)
   {
   sel_col-newmat[col(newmat)==i]
 {ResultMat[,i]-coef(fitdistr(sel_col,weibull))}
xwei.shape- ResultMat[1,i]
  xwei.scale- ResultMat[2,i]
curve(pweibull(x, shape=xwei.shape, scale=xwei.scale, lower.tail=TRUE,
log.p = FALSE), add=TRUE, col='grey',lwd=0.5)
ResultMat2[,i]-pweibull(x, shape=xwei.shape, scale =
xwei.scale,lower.tail=TRUE, log.p=FALSE)
}

## convert dataframe to numeric
MatOut- as.matrix(ResultMat2)
mode(MatOut) - numeric

# initiate variables
mm-ml-mu-rep(0,length(MatOut[,1]))

# mean and upper/lower quantiles
for(i in 1:length(MatOut[,1])){
mm[i]- mean(MatOut[i,])
ml[i]- quantile(MatOut[i,], 0.025, na.rm=TRUE)
mu[i]- quantile(MatOut[i,], 0.975, na.rm=TRUE)
}
#lines(x, mm, col=black)
lines(x, ml, col=blue, lwd=2)
lines(x, mu, col=blue, lwd=2)

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] How to delete file - http://CannotDeleteFile.net

2011-06-23 Thread Mark Finger

Have you ever run into a situation where you wanted to delete a file,
but Windows simply wouldn’t allow you to do it? Personally, these
things happen to me all the time, especially when I’m at a client’s
house trying to get their machine clean of malware. Have you ever
tried deleting a locked file using common windows commands? If so,
then you’ll know that this is just not possible.

if yes, you can try special tool Long Path Tool
http://HowToDeleteFile.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] RBloomberg data account for dividend

2011-06-23 Thread michalseneca

Solved :)

Just use RBloomberg with RJava and use option_names 

more in
http://www.carfield.com.hk/document/Finance/rbloomberg-manual-0-4-144.pdf

Thanks mike you did a good job :)

--
View this message in context: 
http://r.789695.n4.nabble.com/RBloomberg-data-account-for-dividend-tp3614213p3619816.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] R-squared values for multiple linear regression with a matrix of multiple response variables

2011-06-23 Thread Manabu Sakamoto

Dear list,

I have a matrix Y of multiple response variables and a matrix X of
predictor variables and I would like to fit a multivariate multiple
regression model and compute the R2-value to determine the overall
proportion of variance of the response matrix Y that is explained by
the predictor matrix X.

I have been using manova(Y ~ X) to assess the significance of the
linear model. I am also using lm(Y ~ X) or lm(cbind(y1, y2, ...) ~ x1
+ x2 + x3 +) but these seem to fit separate multiple linear models
to each response variable, i.e., summary(lm_object) would return a
list of regression summaries for each response variable.

I would actually like to fit a model on the two matrices with one as
the response and the other as the predictor, and compute an R2 value
of the correlation between the two matrices. Is there a built-in
function in R that does this? If not, how can I compute an R2 value of
a correlation between two matrices?


best,
Manabu
-- 
Manabu Sakamoto, PhD
School of Earth Sciences
University of Bristol
manabu.sakam...@googlemail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] problem (and solution) to rle on vector with NA values

2011-06-23 Thread Cormac Long

Hello there R-help,

I'm not sure if this should be posted here - so apologies if this is the case.
I've found a problem while using rle and am proposing a solution to the issue.

Description:
I ran into a niggle with rle today when working with vectors with NA values
(using R 2.31.0 on Windows 7 x64). It transpires that a run of NA values
is not encoded in the same way as a run of other values. See the following
example as an illustration:

Example:
The example
    rv-c(1,1,NA,NA,3,3,3);rle(rv)
Returns
    Run Length Encoding
      lengths: int [1:4] 2 1 1 3
      values : num [1:4] 1 NA NA 3
not
    Run Length Encoding
      lengths: int [1:3] 2 2 3
      values : num [1:3] 1 NA 3
as I expected. This caused my code to fail later (unsurprising).

Analysis:
The problem stems from the test
         y - x[-1L] != x[-n]
in line 7 of the rle function body. In this test, NA values return logical NA
values, not TRUE/FALSE (again, unsurprising).

Resolution:
I modified the rle function code as included below. As far as I tested, this
modification appears safe. The convoluted construction of naMaskVal
should guarantee that the NA masking value is always different from
any value in the vector and should be safe regardless of the input vector
form (a raw vector is not handled since the NA values do not apply here).

rle-function (x)
{
    if (!is.vector(x)  !is.list(x))
    stop('x' must be an atomic vector)
    n - length(x)
    if (n == 0L)
    return(structure(list(lengths = integer(), values = x),
    class = rle))

     BEGIN NEW SECTION PART 1 
    naRepFlag-F
    if(any(is.na(x))){
    naRepFlag-T
    IS_LOGIC-ifelse(typeof(x)==logical,T,F)

    if(typeof(x)==logical){
    x-as.integer(x)
    naMaskVal-2
    }else if(typeof(x)==character){
    
naMaskVal-paste(sample(c(letters,LETTERS,0:9),32,replace=T),collapse=)
    }else{
    naMaskVal-max(0,abs(x[!is.infinite(x)]),na.rm=T)+1
    }

    x[which(is.na(x))]-naMaskVal
    }
     END NEW SECTION PART 1 

    y - x[-1L] != x[-n]
    i - c(which(y), n)

     BEGIN NEW SECTION PART 2 
    if(naRepFlag)
    x[which(x==naMaskVal)]-NA

    if(IS_LOGIC)
    x-as.logical(x)
     END NEW SECTION PART 2 

    structure(list(lengths = diff(c(0L, i)), values = x[i]),
    class = rle)
}

Conclusion:
I think that the proposed code modification is an improvement on the existing
implementation of rle. Is it impertinent to suggest this R-modification to the
gurus at R?

Best wishes (in flame-war trepidation),
Dr. Cormac Long.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] agrep: How to match with more than 1 substitution?

2011-06-23 Thread Niklaus Kuehnis


Hi all

I'm trying to match a numeric code to a vector of numeric codes:

a - c(12345, 12346, 12347)

agrep(12349, a, max.distance=list(substitutions=1))
# [1] 1 2 3

agrep(12399, a, max.distance=list(substitutions=2))
# integer(0)

I didn't expect the latter result as substituting two characters from 
the pattern makes the pattern identical to the vector elements.


What do I need to change to match with more than 1 substitution allowed?

Thanks in advance,
Nik

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Stepwise Manova

2011-06-23 Thread Schuster, Veronika

Hello all,

I have a question on manova in R:

I'm using the function manova() from the stats package.

Is there anything like a stepwise (backward or forward) manova in R  (like 
there is for regression and anova).

When I enter:

step(Model1, data=Mydata)

R returns the message:
Error in drop1.mlm(fit, scope$drop, scale = scale, trace = trace, k = k,  : 
  no 'drop1' method for mlm models

Which tells me that there is no step, drop1 or add1 function for multilinear 
models (mlm). Correct?

My model looks like this:
Model1 - lm(cbind(Var1, Var2, Var3, Var4, Var5) ~ X, data=Mydata)

But is there another way to do a stepwise manova?

Thanks a lot for your help,

Veronika



**
This message is intended only for the use of the address...{{dropped:14}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] problem (and solution) to rle on vector with NA values

2011-06-23 Thread Peter Ehlers


On 2011-06-23 06:44, Cormac Long wrote:

Hello there R-help,

I'm not sure if this should be posted here - so apologies if this is the case.
I've found a problem while using rle and am proposing a solution to the issue.

Description:
I ran into a niggle with rle today when working with vectors with NA values
(using R 2.31.0 on Windows 7 x64). It transpires that a run of NA values
is not encoded in the same way as a run of other values. See the following
example as an illustration:

Example:
The example
 rv-c(1,1,NA,NA,3,3,3);rle(rv)
Returns
 Run Length Encoding
   lengths: int [1:4] 2 1 1 3
   values : num [1:4] 1 NA NA 3
not
 Run Length Encoding
   lengths: int [1:3] 2 2 3
   values : num [1:3] 1 NA 3
as I expected. This caused my code to fail later (unsurprising).

Analysis:
The problem stems from the test
  y- x[-1L] != x[-n]
in line 7 of the rle function body. In this test, NA values return logical NA
values, not TRUE/FALSE (again, unsurprising).

Resolution:
I modified the rle function code as included below. As far as I tested, this
modification appears safe. The convoluted construction of naMaskVal
should guarantee that the NA masking value is always different from
any value in the vector and should be safe regardless of the input vector
form (a raw vector is not handled since the NA values do not apply here).

rle-function (x)
{
 if (!is.vector(x)  !is.list(x))
 stop('x' must be an atomic vector)
 n- length(x)
 if (n == 0L)
 return(structure(list(lengths = integer(), values = x),
 class = rle))

  BEGIN NEW SECTION PART 1 
 naRepFlag-F
 if(any(is.na(x))){
 naRepFlag-T
 IS_LOGIC-ifelse(typeof(x)==logical,T,F)

 if(typeof(x)==logical){
 x-as.integer(x)
 naMaskVal-2
 }else if(typeof(x)==character){
 
naMaskVal-paste(sample(c(letters,LETTERS,0:9),32,replace=T),collapse=)
 }else{
 naMaskVal-max(0,abs(x[!is.infinite(x)]),na.rm=T)+1
 }

 x[which(is.na(x))]-naMaskVal
 }
  END NEW SECTION PART 1 

 y- x[-1L] != x[-n]
 i- c(which(y), n)

  BEGIN NEW SECTION PART 2 
 if(naRepFlag)
 x[which(x==naMaskVal)]-NA

 if(IS_LOGIC)
 x-as.logical(x)
  END NEW SECTION PART 2 

 structure(list(lengths = diff(c(0L, i)), values = x[i]),
 class = rle)
}

Conclusion:
I think that the proposed code modification is an improvement on the existing
implementation of rle. Is it impertinent to suggest this R-modification to the
gurus at R?

Best wishes (in flame-war trepidation),


Well, it's not worth a flame, but ...
from the help page (see 'Details'):

 Missing values are regarded as unequal to the previous value,
  even if that is also missing.

Peter Ehlers



Dr. Cormac Long.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] How to delete file -



On Jun 23, 2011, at 9:48 AM, Mark Finger wrote:


Have you ever run into a situation where you wanted to delete a file,
but Windows simply wouldn’t allow you to do it? Personally, these
things happen to me all the time, especially when I’m at a client’s
house trying to get their machine clean of malware. Have you ever
tried deleting a locked file using common windows commands? If so,
then you’ll know that this is just not possible.


My apologies to the list. This looked like an off-topic question ( but  
we are fairly permissive in our review policy ) when cursorily viewed  
on the moderation panel. It looks like spam when seen in full display.


--

David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] How to delete file -




On 23.06.2011 16:50, David Winsemius wrote:


On Jun 23, 2011, at 9:48 AM, Mark Finger wrote:


Have you ever run into a situation where you wanted to delete a file,
but Windows simply wouldn’t allow you to do it? Personally, these
things happen to me all the time, especially when I’m at a client’s
house trying to get their machine clean of malware. Have you ever
tried deleting a locked file using common windows commands? If so,
then you’ll know that this is just not possible.


My apologies to the list. This looked like an off-topic question ( but
we are fairly permissive in our review policy ) when cursorily viewed on
the moderation panel. It looks like spam when seen in full display.


No need to apologize, but time to thank you and the other volunteers for 
keeping the list free of spam!


Best,
Uwe

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] problem (and solution) to rle on vector with NA values

2011-06-23 Thread Nick Sabbe

Hello Cormac.

Not having thoroughly checked whether your code actually works, the behavior
of rle you describe is the one documented (check the details of ?rle) and
makes sense as the missingness could have different reasons.
As such, changing this type of behavior would probably break a lot of
existing code that is built on top of rle.

There are other peculiarities and disputabilities about some base R
functions (the order of the arguments for sample trips me every time), but
unless the argument is really strong or a downright bug, I doubt people will
be willing to change this. Perhaps making the new behavior optional (through
a new parameter na.action or similar, with the default the original
behavior) is an option?

Feel free to run your own version of rle in any case. I suggest you rename
it, though, as it may cause problems for some packages.


Nick Sabbe
--
ping: nick.sa...@ugent.be
link: http://biomath.ugent.be
wink: A1.056, Coupure Links 653, 9000 Gent
ring: 09/264.59.36

-- Do Not Disapprove




 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of Cormac Long
 Sent: donderdag 23 juni 2011 15:44
 To: r-help@r-project.org
 Subject: [R] problem (and solution) to rle on vector with NA values
 
 Hello there R-help,
 
 I'm not sure if this should be posted here - so apologies if this is
 the case.
 I've found a problem while using rle and am proposing a solution to the
 issue.
 
 Description:
 I ran into a niggle with rle today when working with vectors with NA
 values
 (using R 2.31.0 on Windows 7 x64). It transpires that a run of NA
 values
 is not encoded in the same way as a run of other values. See the
 following
 example as an illustration:
 
 Example:
 The example
     rv-c(1,1,NA,NA,3,3,3);rle(rv)
 Returns
     Run Length Encoding
       lengths: int [1:4] 2 1 1 3
       values : num [1:4] 1 NA NA 3
 not
     Run Length Encoding
       lengths: int [1:3] 2 2 3
       values : num [1:3] 1 NA 3
 as I expected. This caused my code to fail later (unsurprising).
 
 Analysis:
 The problem stems from the test
          y - x[-1L] != x[-n]
 in line 7 of the rle function body. In this test, NA values return
 logical NA
 values, not TRUE/FALSE (again, unsurprising).
 
 Resolution:
 I modified the rle function code as included below. As far as I tested,
 this
 modification appears safe. The convoluted construction of naMaskVal
 should guarantee that the NA masking value is always different from
 any value in the vector and should be safe regardless of the input
 vector
 form (a raw vector is not handled since the NA values do not apply
 here).
 
 rle-function (x)
 {
     if (!is.vector(x)  !is.list(x))
     stop('x' must be an atomic vector)
     n - length(x)
     if (n == 0L)
     return(structure(list(lengths = integer(), values = x),
     class = rle))
 
      BEGIN NEW SECTION PART 1 
     naRepFlag-F
     if(any(is.na(x))){
     naRepFlag-T
     IS_LOGIC-ifelse(typeof(x)==logical,T,F)
 
     if(typeof(x)==logical){
     x-as.integer(x)
     naMaskVal-2
     }else if(typeof(x)==character){
     naMaskVal-
 paste(sample(c(letters,LETTERS,0:9),32,replace=T),collapse=)
     }else{
     naMaskVal-max(0,abs(x[!is.infinite(x)]),na.rm=T)+1
     }
 
     x[which(is.na(x))]-naMaskVal
     }
      END NEW SECTION PART 1 
 
     y - x[-1L] != x[-n]
     i - c(which(y), n)
 
      BEGIN NEW SECTION PART 2 
     if(naRepFlag)
     x[which(x==naMaskVal)]-NA
 
     if(IS_LOGIC)
     x-as.logical(x)
      END NEW SECTION PART 2 
 
     structure(list(lengths = diff(c(0L, i)), values = x[i]),
     class = rle)
 }
 
 Conclusion:
 I think that the proposed code modification is an improvement on the
 existing
 implementation of rle. Is it impertinent to suggest this R-modification
 to the
 gurus at R?
 
 Best wishes (in flame-war trepidation),
 Dr. Cormac Long.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] else problem

2011-06-23 Thread Kara Przeczek

Dear R users,
I have run into a problem using if...else and I hope you can shed some light on 
it. I am using R version 2.2.0.1.

I have the following data frame:
 head(dat2f)
 year  tot_km3y
[1,] 1964 0.1876854
[2,] 1965 0.1835116
[3,] 1966 0.1915012
[4,] 1967 0.1869758
[5,] 1968 0.2249865
[6,] 1969 0.1916011

I need to pick out the median year, and since there are an even number of data, 
I cannot use 'median' directly since it gives me a non-existent year/discharge. 
I found a way to get around that with the following:

  md - dat2f[, list(med_year = max(year[which(abs(tot_km3y - median(tot_km3y)) 
== min(abs(tot_km3y - median(tot_km3y ]), med_TotQ = median(tot_km3y))]
(I really only need the year, not the actual discharge with that year, which is 
why I left med_TotQ as the true median)

However, I have some data sets that have an odd number of data for which the 
following works perfectly:

  md -dat2f[, list(med_year = year[which(tot_km3y == median(tot_km3y))], 
med_TotQ = median(tot_km3y))]

Thus, I would like to apply the above calculations depended on the condition of:

length(dat2f$year)%%2==0

I put it all together as below:

if (length(dat2f$year)%%2==0) {
  md -dat2f[, list(med_year = max(year[which(abs(tot_km3y - median(tot_km3y)) 
== min(abs(tot_km3y - median(tot_km3y ]), med_TotQ = median(tot_km3y))]  }
else {
  md -dat2f[, list(med_year = year[which(tot_km3y == median(tot_km3y))], 
med_TotQ = median(tot_km3y))]
  }

Each individual piece works perfectly on its own, but together I get the 
following error:

 if (length(dat2f$year)%%2==0) {
+   md -dat2f[, list(med_year = max(year[which(abs(tot_km3y - 
median(tot_km3y)) == min(abs(tot_km3y - median(tot_km3y ]), med_TotQ = 
median(tot_km3y))] }
 else {
Error: unexpected 'else' in else
   md -dat2f[, list(med_year = year[which(tot_km3y == median(tot_km3y))], 
 med_TotQ = median(tot_km3y))]
   }
Error: unexpected '}' in   }


When I tried to look up else I got this error:

 ?else
Error: unexpected 'else' in ?else

I have used exactly the same set up with if...else in other code and it worked 
fine then. I tried to run it again, and I got the same error as above. What is 
the problem? I hope it isn't something simple and silly!

I realize that I can use the first line:

  md - dat2f[, list(med_year = max(year[which(abs(tot_km3y - median(tot_km3y)) 
== min(abs(tot_km3y - median(tot_km3y ]), med_TotQ = median(tot_km3y))]

for all data sets and it will give me the median for both odd and even-length 
data sets, but it is now about the principle; why won't the if...else work?

Thank you very much for your time!

Kara

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Loading List data into R with scan()

2011-06-23 Thread Michael Pearmain

Hi All,

I've been given a data file of the form:
1: 3,4,5,6
2:1,2,3
43: 5,7,8,9,5

and i want to read this data in as a list to create the form:
(guessing final look)
my.list
[[1]]
[1] 3 4 5 6

[[2]]
[1] 1 2 3

[[43]]
[1] 5 7 8 9 5

I can get to a stage using scan:
scan(my.data, what = character(0), quiet = TRUE)
to load
[1] 1: 3,4,5,6
[2] 2:1,2,3
[3] 43: 5,7,8,9,5

but im not sure on how next to proceed to arrange this into a list form, can
anyone offer some advise?

Thanks in advance

Mike

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] else problem



On Jun 23, 2011, at 10:59 AM, Kara Przeczek wrote:


Dear R users,
I have run into a problem using if...else and I hope you can shed  
some light on it. I am using R version 2.2.0.1.


I have the following data frame:

head(dat2f)

year  tot_km3y
[1,] 1964 0.1876854
[2,] 1965 0.1835116
[3,] 1966 0.1915012
[4,] 1967 0.1869758
[5,] 1968 0.2249865
[6,] 1969 0.1916011

I need to pick out the median year, and since there are an even  
number of data, I cannot use 'median' directly since it gives me a  
non-existent year/discharge. I found a way to get around that with  
the following:


 md - dat2f[, list(med_year = max(year[which(abs(tot_km3y -  
median(tot_km3y)) == min(abs(tot_km3y - median(tot_km3y ]),  
med_TotQ = median(tot_km3y))]
(I really only need the year, not the actual discharge with that  
year, which is why I left med_TotQ as the true median)


However, I have some data sets that have an odd number of data for  
which the following works perfectly:


 md -dat2f[, list(med_year = year[which(tot_km3y ==  
median(tot_km3y))], med_TotQ = median(tot_km3y))]


Thus, I would like to apply the above calculations depended on the  
condition of:


length(dat2f$year)%%2==0

I put it all together as below:

if (length(dat2f$year)%%2==0) {
 md -dat2f[, list(med_year = max(year[which(abs(tot_km3y -  
median(tot_km3y)) == min(abs(tot_km3y - median(tot_km3y ]),  
med_TotQ = median(tot_km3y))]  }

else {


If this line is executed at a console session it will fail because the  
interpreter does not keep a copy of the last condition. If you moved  
the closing curley-brace to just befor the 'else', you should get the  
behavior you expect (if your other code is correct):


Try instead:

if (length(dat2f$year)%%2==0) {
 md -dat2f[, list(med_year = max(year[which(abs(tot_km3y -  
median(tot_km3y)) == min(abs(tot_km3y - median(tot_km3y ]),  
med_TotQ = median(tot_km3y))]

} else {
 md -dat2f[, list(med_year = year[which(tot_km3y ==  
median(tot_km3y))], med_TotQ = median(tot_km3y))]

 }


 md -dat2f[, list(med_year = year[which(tot_km3y ==  
median(tot_km3y))], med_TotQ = median(tot_km3y))]

 }

Each individual piece works perfectly on its own, but together I get  
the following error:



if (length(dat2f$year)%%2==0) {
+   md -dat2f[, list(med_year = max(year[which(abs(tot_km3y -  
median(tot_km3y)) == min(abs(tot_km3y - median(tot_km3y ]),  
med_TotQ = median(tot_km3y))] }

else {

Error: unexpected 'else' in else
 md -dat2f[, list(med_year = year[which(tot_km3y ==  
median(tot_km3y))], med_TotQ = median(tot_km3y))]

 }

Error: unexpected '}' in   }




When I tried to look up else I got this error:


?else

Error: unexpected 'else' in ?else


Try instead:

?else

--
David


I have used exactly the same set up with if...else in other code and  
it worked fine then. I tried to run it again, and I got the same  
error as above. What is the problem? I hope it isn't something  
simple and silly!


I realize that I can use the first line:

 md - dat2f[, list(med_year = max(year[which(abs(tot_km3y -  
median(tot_km3y)) == min(abs(tot_km3y - median(tot_km3y ]),  
med_TotQ = median(tot_km3y))]


for all data sets and it will give me the median for both odd and  
even-length data sets, but it is now about the principle; why won't  
the if...else work?


Thank you very much for your time!

Kara


David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] plotmath: unexpected SPECIAL

2011-06-23 Thread Sarah Goslee

On Thu, Jun 23, 2011 at 6:30 AM, Uwe Ligges
lig...@statistik.tu-dortmund.de wrote:
 Folks,

 the relevant thing you have to remember is: All the stuff must be valid R
 syntax (with few additional functions as mention in the ?plotmath help
 file). Knowing that it is obvious where additional operators are required.

 Best,
 Uwe Ligges


Actually, I think that may be the source of the confusion.

I was thinking about it in terms of *labels*, and there's no intrinsic
reason that an axis label or other text caption has to have x and y
arguments, even if it otherwise might be an operator. A right arrow,
as the original querent asked about, could have other uses in a label
than connecting two items. It simply didn't occur to me that it could
*only* be used that way, though in other contexts that would make
perfect sense.

It is only possible to use a binary operator through plotmath, even
though the result is a label, and not to talk about a binary operator.
That's the missing conceptual bit, at least for me.

Sarah

-- 
Sarah Goslee
http://www.functionaldiversity.org

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Loading List data into R with scan()




On 23.06.2011 16:39, Michael Pearmain wrote:

Hi All,

I've been given a data file of the form:
1: 3,4,5,6
2:1,2,3
43: 5,7,8,9,5

and i want to read this data in as a list to create the form:
(guessing final look)
my.list
[[1]]
[1] 3 4 5 6

[[2]]
[1] 1 2 3

[[43]]
[1] 5 7 8 9 5

I can get to a stage using scan:
scan(my.data, what = character(0), quiet = TRUE)
to load
[1] 1: 3,4,5,6
[2] 2:1,2,3
[3] 43: 5,7,8,9,5



I don't understand why you want 40 empty list elements, but here is what 
you asked for (not optimized, just hacked in few seconds):


temp - strsplit(d, :)
num - as.numeric(sapply(temp, [[, 1))
L - vector(mode = list, length = max(num))
for(i in seq_along(temp)){
L[[num[i]]] - as.numeric(unlist(strsplit(temp[[i]][2], ,)))
}
L

Uwe Ligges




but im not sure on how next to proceed to arrange this into a list form, can
anyone offer some advise?

Thanks in advance

Mike

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] else problem

2011-06-23 Thread Bert Gunter

Perhaps some additional clarification...  (???)

 if (length(dat2f$year)%%2==0) {
  md -dat2f[, list(med_year = max(year[which(abs(tot_km3y -
 median(tot_km3y)) == min(abs(tot_km3y - median(tot_km3y ]), med_TotQ =
 median(tot_km3y))]  }
 else {

 If this line is executed at a console session it will fail because the
 interpreter does not keep a copy of the last condition. If you moved the
 closing curley-brace to just befor the 'else', you should get the behavior
 you expect (if your other code is correct):

Well, um.. not sure if this is what you meant, but what is happening
at the console is that when you type return, the interpreter checks
for a syntactically complete statement. If it finds what has been
given to it **thus far** is, it tries to execute it (if not, it gives
a continuation character and waits for more input) and, as you said,
then starts anew to interpret the next line(s) entered, forgetting
all previous. The problem above is that the if() statement up to the
close bracket, } is syntactically complete, and so the else{ that
follows makes no sense as the beginnig of a new line to  be
interpreted.

The simplest and universal solution to this is to simply enclose the
whole conditional in { }:

{if(length ...
...
else {...}
}

This forces the interpreter to wait for the last } before it will
interpret and execute.

Hoping this clarifies rather than obfuscates.

-- Bert




 Try instead:

 if (length(dat2f$year)%%2==0) {
  md -dat2f[, list(med_year = max(year[which(abs(tot_km3y -
 median(tot_km3y)) == min(abs(tot_km3y - median(tot_km3y ]), med_TotQ =
 median(tot_km3y))]
 } else {
  md -dat2f[, list(med_year = year[which(tot_km3y == median(tot_km3y))],
 med_TotQ = median(tot_km3y))]
  }


  md -dat2f[, list(med_year = year[which(tot_km3y == median(tot_km3y))],
 med_TotQ = median(tot_km3y))]
  }

 Each individual piece works perfectly on its own, but together I get the
 following error:

 if (length(dat2f$year)%%2==0) {

 +   md -dat2f[, list(med_year = max(year[which(abs(tot_km3y -
 median(tot_km3y)) == min(abs(tot_km3y - median(tot_km3y ]), med_TotQ =
 median(tot_km3y))] }

 else {

 Error: unexpected 'else' in else

  md -dat2f[, list(med_year = year[which(tot_km3y == median(tot_km3y))],
 med_TotQ = median(tot_km3y))]
  }

 Error: unexpected '}' in   }


 When I tried to look up else I got this error:

 ?else

 Error: unexpected 'else' in ?else

 Try instead:

 ?else

 --
 David

 I have used exactly the same set up with if...else in other code and it
 worked fine then. I tried to run it again, and I got the same error as
 above. What is the problem? I hope it isn't something simple and silly!

 I realize that I can use the first line:

  md - dat2f[, list(med_year = max(year[which(abs(tot_km3y -
 median(tot_km3y)) == min(abs(tot_km3y - median(tot_km3y ]), med_TotQ =
 median(tot_km3y))]

 for all data sets and it will give me the median for both odd and
 even-length data sets, but it is now about the principle; why won't the
 if...else work?

 Thank you very much for your time!

 Kara

 David Winsemius, MD
 West Hartford, CT

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Men by nature long to get on to the ultimate truths, and will often
be impatient with elementary studies or fight shy of them. If it were
possible to reach the ultimate truths without the elementary studies
usually prefixed to them, these would not be preparatory studies but
superfluous diversions.

-- Maimonides (1135-1204)

Bert Gunter
Genentech Nonclinical Biostatistics

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Loading List data into R with scan()

2011-06-23 Thread Henrique Dallazuanna

l - readLines(textConnection(1: 3,4,5,6
2:1,2,3
43: 5,7,8,9,5))

On Thu, Jun 23, 2011 at 12:28 PM, Henrique Dallazuanna www...@gmail.com wrote:
 Try this:

  sapply(lapply(strsplit(l, :), strsplit, ,),
 function(x)structure(lapply(x[2], as.numeric), .Names = x[1]))

 On Thu, Jun 23, 2011 at 11:39 AM, Michael Pearmain
 michael.pearm...@gmail.com wrote:
 Hi All,

 I've been given a data file of the form:
 1: 3,4,5,6
 2:1,2,3
 43: 5,7,8,9,5

 and i want to read this data in as a list to create the form:
 (guessing final look)
 my.list
 [[1]]
 [1] 3 4 5 6

 [[2]]
 [1] 1 2 3

 [[43]]
 [1] 5 7 8 9 5

 I can get to a stage using scan:
 scan(my.data, what = character(0), quiet = TRUE)
 to load
 [1] 1: 3,4,5,6
 [2] 2:1,2,3
 [3] 43: 5,7,8,9,5

 but im not sure on how next to proceed to arrange this into a list form, can
 anyone offer some advise?

 Thanks in advance

 Mike

        [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




 --
 Henrique Dallazuanna
 Curitiba-Paraná-Brasil
 25° 25' 40 S 49° 16' 22 O




-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Loading List data into R with scan()

2011-06-23 Thread Michael Pearmain

Thanks Uwe,

The list elements was a mistake on my part, i just wanted everything before
the : to be the name of the element.
Thanks for the help, i can play around with this to get what i want.

M


2011/6/23 Uwe Ligges lig...@statistik.tu-dortmund.de



 On 23.06.2011 16:39, Michael Pearmain wrote:

 Hi All,

 I've been given a data file of the form:
 1: 3,4,5,6
 2:1,2,3
 43: 5,7,8,9,5

 and i want to read this data in as a list to create the form:
 (guessing final look)
 my.list
 [[1]]
 [1] 3 4 5 6

 [[2]]
 [1] 1 2 3

 [[43]]
 [1] 5 7 8 9 5

 I can get to a stage using scan:
 scan(my.data, what = character(0), quiet = TRUE)
 to load
 [1] 1: 3,4,5,6
 [2] 2:1,2,3
 [3] 43: 5,7,8,9,5



 I don't understand why you want 40 empty list elements, but here is what
 you asked for (not optimized, just hacked in few seconds):

 temp - strsplit(d, :)
 num - as.numeric(sapply(temp, [[, 1))
 L - vector(mode = list, length = max(num))
 for(i in seq_along(temp)){
L[[num[i]]] - as.numeric(unlist(strsplit(**temp[[i]][2], ,)))
 }
 L

 Uwe Ligges



  but im not sure on how next to proceed to arrange this into a list form,
 can
 anyone offer some advise?

 Thanks in advance

 Mike

[[alternative HTML version deleted]]

 __**
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/**listinfo/r-helphttps://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/**
 posting-guide.html http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Loading List data into R with scan()

2011-06-23 Thread Henrique Dallazuanna

Try this:

 sapply(lapply(strsplit(l, :), strsplit, ,),
function(x)structure(lapply(x[2], as.numeric), .Names = x[1]))

On Thu, Jun 23, 2011 at 11:39 AM, Michael Pearmain
michael.pearm...@gmail.com wrote:
 Hi All,

 I've been given a data file of the form:
 1: 3,4,5,6
 2:1,2,3
 43: 5,7,8,9,5

 and i want to read this data in as a list to create the form:
 (guessing final look)
 my.list
 [[1]]
 [1] 3 4 5 6

 [[2]]
 [1] 1 2 3

 [[43]]
 [1] 5 7 8 9 5

 I can get to a stage using scan:
 scan(my.data, what = character(0), quiet = TRUE)
 to load
 [1] 1: 3,4,5,6
 [2] 2:1,2,3
 [3] 43: 5,7,8,9,5

 but im not sure on how next to proceed to arrange this into a list form, can
 anyone offer some advise?

 Thanks in advance

 Mike

        [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Loading List data into R with scan()



On Jun 23, 2011, at 11:19 AM, Uwe Ligges wrote:




On 23.06.2011 16:39, Michael Pearmain wrote:

Hi All,

I've been given a data file of the form:
1: 3,4,5,6
2:1,2,3
43: 5,7,8,9,5

and i want to read this data in as a list to create the form:
(guessing final look)
my.list
[[1]]
[1] 3 4 5 6

[[2]]
[1] 1 2 3

[[43]]
[1] 5 7 8 9 5

I can get to a stage using scan:
scan(my.data, what = character(0), quiet = TRUE)
to load
[1] 1: 3,4,5,6
[2] 2:1,2,3
[3] 43: 5,7,8,9,5



I don't understand why you want 40 empty list elements, but here is  
what you asked for (not optimized, just hacked in few seconds):


temp - strsplit(d, :)
num - as.numeric(sapply(temp, [[, 1))
L - vector(mode = list, length = max(num))
for(i in seq_along(temp)){
   L[[num[i]]] - as.numeric(unlist(strsplit(temp[[i]][2], ,)))
}
L


I wondered about that too. Perhaps he would be satisfied with alpha  
indexing:


d - c( 1: 3,4,5,6, 2:1,2,3, 43: 5,7,8,9,5)
 temp - strsplit(d, :)
 num - sapply(temp, [[, 1)
 L - vector(mode = list)
 for(i in seq_along(temp)){
L[[num[i]]] - as.numeric(unlist(strsplit(temp[[i]][2], ,)))
 }

 L
$`1`
[1] 3 4 5 6

$`2`
[1] 1 2 3

$`43`
[1] 5 7 8 9 5



Uwe Ligges



but im not sure on how next to proceed to arrange this into a list  
form, can

anyone offer some advise?

Thanks in advance

Mike




David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] ddply to count frequency of combinations

2011-06-23 Thread Brian Diggs


On 6/22/2011 11:02 PM, Idris Raja wrote:

Brian,

I'm a bit confused about how the following line works, specifically, what is
happening in freq=length(x)? Is it just taking the length of x after it has
been summarized by different combinations x  y? I guess that must be the
case, because that gives the same result as using freq=length(y)

d1-ddply(d, .(x, y), summarize, freq=length(x))
d2-ddply(d, .(x, y), summarize, freq=length(y))


Effectively, ddply takes the dataframe (d), splits it up into multiple 
dataframes based on unique combinations of the variables (x and y), and 
calls the function (summarize) with each of the sub-dataframes in turn. 
 ddply also has the option to pass additional parameters to the 
function that is called.  In this case, that is what happens with 
freq=length(x).  Each sub-dataframe is the first argument to a call to 
summarize([sub-dataframe], freq=length(x)).


summarize, in turn, takes a dataframe and other arguments in the form of 
var=value.  It evaluates each of the values in the context of the 
dataframe (that is, column names can be used directly as variables) and 
assigns the result to the variable var.  These var's then become the 
columns of a new dataframe.


 summarize(df, freq=length(x))
  freq
19

You are right that length(y) would work just as well; since they are 
both columns in the same dataframe, they must have the same length.


(The last thing ddply does is take all the dataframes that are returned 
from the function calls and put them back together into a single 
dataframe which also includes information on which subset each 
corresponds to.)



Also, what is the significance of the periods before the second argument in
.(x, y) ?


The variables to split on can be given as quoted variables, a formula 
or character vector.  The . is a function in plyr that quotes variables 
(the first option).  The following three are identical:


ddply(df, .(x, y), summarise, freq=length(x))
ddply(df, ~x+y, summarise, freq=length(x))
ddply(df, c(x, y), summarise, freq=length(x))


Thanks for the help.


You may also benefit from reading Hadley's paper on the topic:

Hadley Wickham (2011). The Split-Apply-Combine Strategy for Data 
Analysis. Journal of Statistical Software, 40(1), 1-29. 
http://www.jstatsoft.org/v40/i01/.



On Tue, Jun 21, 2011 at 12:54 PM, Brian Diggsdig...@ohsu.edu  wrote:


On 6/21/2011 11:30 AM, Idris Raja wrote:


I have a dataframe df with two columns x and y. I want to count the number
of times a unique x, y combination occurs.

For example

x- c(1,2,3,4,5,1,2,3,4)
y- c(1,2,3,4,5,1,2,4,1)

df-as.data.frame(cbind(x, y))

#what is the correct way to use ddply for this example?
ddply(df, c('x','y', summarize, ??)

#desired output -- format and order doesn't matter
# (x, y) count
#
# (1, 1) 2
# (2, 2) 2
# (3, 3) 1
# (4, 4) 1
# (5, 5) 1
# (2, 3) 1
# (3, 4) 1
# (4, 1) 1

[[alternative HTML version deleted]]



Jorge and Dennis gave good responses that get you to the result you asked
for, but for completeness I thought I'd include some ddply versions:

ddply(d, .(x, y), summarize, freq=length(x))

This uses the summarize function you were asking about, however you can
also do it with:

ddply(d, .(x, y), nrow)

or

ddply(d, .(x, y), as.data.frame(nrow))

The latter giving a slightly nicer name (value instead of V1).

As an aside, I prefer using the summarise spelling of the function when I
do use it, because it won't clash with Hmisc::summarize.

ddply(d, .(x, y), summarise, freq=length(x))


--
Brian S. Diggs, PhD
Senior Research Associate, Department of Surgery
Oregon Health  Science University



[[alternative HTML version deleted]]


--
Brian S. Diggs, PhD
Senior Research Associate, Department of Surgery
Oregon Health  Science University

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Memory(RAM) issues

2011-06-23 Thread Abhisek Saha

Hi Lui, Anupam and my other R-user friends,
Thanks  for your previous suggestions. As for my issue, it is clearly
RAM problem as my code is running perfectly as long as my input data
size is small and code has been refined a number of times to increase
efficiency [ by using matrix more in the context of manipulations and
reducing order of time complexity wherever possible]. Yet  it has
gotta run 2000 iterations and in each of the iterations it repeatedly
modifies  few [0.5 Mn * 15] tables

After few basic investigation it seems that the problem MAY be
resolved if 64 bit version of R is installed in 64 bit OS as it claims
to be able use potentially a  8TB RAM ( for my case I think 4 GB RAM
may serve the purpose but 2 GB is not enough) if available.I don't
have any domain knowledge in IT since I hail from a statistics
background.  I work in a company as a statistical analyst. The IT
folks of my company are ready to deploy any free solution but I need
to tell them what software needs to be installed and what all OS
configurations would be required. They normally run all of the
applications on unix servors.

So I need to know if any free 64 bit of R version can be installed in
unix servor or if not unix, may be on other servor. So hereby I
request my R-user friends to please let me know if any free 64 bit
version of R is available that can be installed on any unix servor. If
that is not available is there any other FREE  solution and if
available, how to get that and what all configuration is required.

Awaiting your replies,
Regards,
Abhisek

P.S. Please forward this mail to any other R-mailing list if you deem
it fit for any of them.

On Sat, Jun 11, 2011 at 4:38 PM, Lui ## lui.r.proj...@googlemail.com wrote:
 Hello Abhisek,

 maybe you wanna try it on just a bigger machine (I guess you are
 working at a university, so I hope you do have access to them). In
 case getting computing time or the like is a major issue, you may
 wanna try Amazon AWS: For a few rupees (about 50-100 per hour) you can
 rent pretty fast computers (20 Ghz, 8BG of RAM). You may want to try
 out the Windows version (little bit more expensive) which is easily
 accessible via remote desktop. Installing Revolution (which is free
 for academic purposes) (64 Bit Version) might give you a good start.
 Maybe its not a viable option in a long term view (pricing), but it
 might help to get a clue whether the problem can be solved on a bigger
 machine and just trying it out without a bigger hastle...

 Good luck!

 Lui

 On Sat, Jun 11, 2011 at 7:47 AM, Abhisek Saha tad...@gmail.com wrote:
 Thanks Anupam for your inputs. I believe there are two ways to
 circumvent the issue...1 making the code more efficient 1 Increasing
 the memory in some way.The reasons why I did not paste the code are 1
 It is long and using quite a number of functions that  I created 2
 Secondly my intention is not to make the code more efficient if that's
 possible. Here landing into a memory problem with 2 GB RAM is natural
 as my analysis entails 1500 simulations from huge multivariate
 distributions that change after every simulation and tomorrow I may
 have to do similar analysis with 10 million observations * 20 columns.

 In view of above I shall be needing more memory sometime later and my
 IT friends are ready to support me for that(probably with a sandbox)
 but I am not sure whether I can install probably a servor version of R
 that can be capable of working with 8GB or so RAM. So it is more of
 technical help I need and I have no clue regarding the plausibility of
 the solution mentioned( i.e. a servor version of R that is capable of
 more memory).

 Regards,
 Abhisek

 On Sat, Jun 11, 2011 at 10:10 AM, Anupam anupa...@gmail.com wrote:

 It will be helpful on this forum to use metric measures: 12 Lakh is 1.2
 million, thus your data is 1.2 million observations x 15 variables. I do not
 know the intricacies of R. You may have to wait for someone with that
 knowledge to respond.

 Including some relevant portions of error messages and code in your query
 can also help someone to respond to your message.

 Anupam.
 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
 Behalf Of Abhisek Saha
 Sent: Saturday, June 11, 2011 6:25 AM
 To: r-help@r-project.org
 Subject: [R] Memory(RAM) issues

 Dear All,
 I have been working with R(desktop version) in VISTA. I have the latest
 version R 2.13.0. I have been working with few data-sets of size 12Lakh * 15
 and my code is quite computing intensive ( applying MCMC gibbs sampler on a
 posterior of 144 variables) that often runs into a memory issue such as
 memory can not allocate the vector ,full size(shows to have reached
 something like 1.5 GB) reached or something to this effect. I have a RAM of
 2GB.  I checked with the option like memory.size and it says a 64 bit R if
 sat on 64 bit windows then max memory capability is 8TB.

 Now I don't have  background to understand the

Re: [R] Loading List data into R with scan()

2011-06-23 Thread Michael Pearmain

Thanks All,

Henrique, gave me the solution is was looking for, the indexing was a
mistake on my part.

Thanks again

On 23 June 2011 16:37, David Winsemius dwinsem...@comcast.net wrote:


 On Jun 23, 2011, at 11:19 AM, Uwe Ligges wrote:



 On 23.06.2011 16:39, Michael Pearmain wrote:

 Hi All,

 I've been given a data file of the form:
 1: 3,4,5,6
 2:1,2,3
 43: 5,7,8,9,5

 and i want to read this data in as a list to create the form:
 (guessing final look)
 my.list
 [[1]]
 [1] 3 4 5 6

 [[2]]
 [1] 1 2 3

 [[43]]
 [1] 5 7 8 9 5

 I can get to a stage using scan:
 scan(my.data, what = character(0), quiet = TRUE)
 to load
 [1] 1: 3,4,5,6
 [2] 2:1,2,3
 [3] 43: 5,7,8,9,5



 I don't understand why you want 40 empty list elements, but here is what
 you asked for (not optimized, just hacked in few seconds):

 temp - strsplit(d, :)
 num - as.numeric(sapply(temp, [[, 1))
 L - vector(mode = list, length = max(num))
 for(i in seq_along(temp)){
   L[[num[i]]] - as.numeric(unlist(strsplit(**temp[[i]][2], ,)))
 }
 L


 I wondered about that too. Perhaps he would be satisfied with alpha
 indexing:

 d - c( 1: 3,4,5,6, 2:1,2,3, 43: 5,7,8,9,5)
  temp - strsplit(d, :)
  num - sapply(temp, [[, 1)
  L - vector(mode = list)

  for(i in seq_along(temp)){
L[[num[i]]] - as.numeric(unlist(strsplit(**temp[[i]][2], ,)))
  }

  L
 $`1`

 [1] 3 4 5 6

 $`2`
 [1] 1 2 3

 $`43`
 [1] 5 7 8 9 5


  Uwe Ligges



  but im not sure on how next to proceed to arrange this into a list form,
 can
 anyone offer some advise?

 Thanks in advance

 Mike




 David Winsemius, MD
 West Hartford, CT



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] problem (and solution) to rle on vector with NA values

2011-06-23 Thread Cormac Long

D'oh! Completely missed that. Definately a case or RTFMS (RTFM, Stupid).

My apologies for the spam.

Sincerely (with additional grovelling)
Cormac.

On 23 June 2011 15:59, Nick Sabbe nick.sa...@ugent.be wrote:
 Hello Cormac.

 Not having thoroughly checked whether your code actually works, the behavior
 of rle you describe is the one documented (check the details of ?rle) and
 makes sense as the missingness could have different reasons.
 As such, changing this type of behavior would probably break a lot of
 existing code that is built on top of rle.

 There are other peculiarities and disputabilities about some base R
 functions (the order of the arguments for sample trips me every time), but
 unless the argument is really strong or a downright bug, I doubt people will
 be willing to change this. Perhaps making the new behavior optional (through
 a new parameter na.action or similar, with the default the original
 behavior) is an option?

 Feel free to run your own version of rle in any case. I suggest you rename
 it, though, as it may cause problems for some packages.


 Nick Sabbe
 --
 ping: nick.sa...@ugent.be
 link: http://biomath.ugent.be
 wink: A1.056, Coupure Links 653, 9000 Gent
 ring: 09/264.59.36

 -- Do Not Disapprove




 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of Cormac Long
 Sent: donderdag 23 juni 2011 15:44
 To: r-help@r-project.org
 Subject: [R] problem (and solution) to rle on vector with NA values

 Hello there R-help,

 I'm not sure if this should be posted here - so apologies if this is
 the case.
 I've found a problem while using rle and am proposing a solution to the
 issue.

 Description:
 I ran into a niggle with rle today when working with vectors with NA
 values
 (using R 2.31.0 on Windows 7 x64). It transpires that a run of NA
 values
 is not encoded in the same way as a run of other values. See the
 following
 example as an illustration:

 Example:
 The example
     rv-c(1,1,NA,NA,3,3,3);rle(rv)
 Returns
     Run Length Encoding
       lengths: int [1:4] 2 1 1 3
       values : num [1:4] 1 NA NA 3
 not
     Run Length Encoding
       lengths: int [1:3] 2 2 3
       values : num [1:3] 1 NA 3
 as I expected. This caused my code to fail later (unsurprising).

 Analysis:
 The problem stems from the test
          y - x[-1L] != x[-n]
 in line 7 of the rle function body. In this test, NA values return
 logical NA
 values, not TRUE/FALSE (again, unsurprising).

 Resolution:
 I modified the rle function code as included below. As far as I tested,
 this
 modification appears safe. The convoluted construction of naMaskVal
 should guarantee that the NA masking value is always different from
 any value in the vector and should be safe regardless of the input
 vector
 form (a raw vector is not handled since the NA values do not apply
 here).

 rle-function (x)
 {
     if (!is.vector(x)  !is.list(x))
     stop('x' must be an atomic vector)
     n - length(x)
     if (n == 0L)
     return(structure(list(lengths = integer(), values = x),
     class = rle))

      BEGIN NEW SECTION PART 1 
     naRepFlag-F
     if(any(is.na(x))){
     naRepFlag-T
     IS_LOGIC-ifelse(typeof(x)==logical,T,F)

     if(typeof(x)==logical){
     x-as.integer(x)
     naMaskVal-2
     }else if(typeof(x)==character){
     naMaskVal-
 paste(sample(c(letters,LETTERS,0:9),32,replace=T),collapse=)
     }else{
     naMaskVal-max(0,abs(x[!is.infinite(x)]),na.rm=T)+1
     }

     x[which(is.na(x))]-naMaskVal
     }
      END NEW SECTION PART 1 

     y - x[-1L] != x[-n]
     i - c(which(y), n)

      BEGIN NEW SECTION PART 2 
     if(naRepFlag)
     x[which(x==naMaskVal)]-NA

     if(IS_LOGIC)
     x-as.logical(x)
      END NEW SECTION PART 2 

     structure(list(lengths = diff(c(0L, i)), values = x[i]),
     class = rle)
 }

 Conclusion:
 I think that the proposed code modification is an improvement on the
 existing
 implementation of rle. Is it impertinent to suggest this R-modification
 to the
 gurus at R?

 Best wishes (in flame-war trepidation),
 Dr. Cormac Long.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Memory(RAM) issues



On Jun 23, 2011, at 11:57 AM, Abhisek Saha wrote:


Hi Lui, Anupam and my other R-user friends,
Thanks  for your previous suggestions. As for my issue, it is clearly
RAM problem as my code is running perfectly as long as my input data
size is small and code has been refined a number of times to increase
efficiency [ by using matrix more in the context of manipulations and
reducing order of time complexity wherever possible]. Yet  it has
gotta run 2000 iterations and in each of the iterations it repeatedly
modifies  few [0.5 Mn * 15] tables

After few basic investigation it seems that the problem MAY be
resolved if 64 bit version of R is installed in 64 bit OS as it claims
to be able use potentially a  8TB RAM ( for my case I think 4 GB RAM
may serve the purpose but 2 GB is not enough) if available.I don't
have any domain knowledge in IT since I hail from a statistics
background.  I work in a company as a statistical analyst. The IT
folks of my company are ready to deploy any free solution but I need
to tell them what software needs to be installed and what all OS
configurations would be required. They normally run all of the
applications on unix servors.

So I need to know if any free 64 bit of R version can be installed in
unix servor or if not unix, may be on other servor. So hereby I
request my R-user friends to please let me know if any free 64 bit
version of R is available


Yes.


that can be installed on any unix servor. If
that is not available is there any other FREE  solution and if
available, how to get that and what all configuration is required.


There are a bunch of UNIX-alike platforms. Read the FAQ:

http://cran.r-project.org/doc/FAQ/R-FAQ.html#What-machines-does-R-run-on_003f

--  
David.


Awaiting your replies,
Regards,
Abhisek

P.S. Please forward this mail to any other R-mailing list if you deem
it fit for any of them.

On Sat, Jun 11, 2011 at 4:38 PM, Lui ##  
lui.r.proj...@googlemail.com wrote:

Hello Abhisek,

maybe you wanna try it on just a bigger machine (I guess you are
working at a university, so I hope you do have access to them). In
case getting computing time or the like is a major issue, you may
wanna try Amazon AWS: For a few rupees (about 50-100 per hour) you  
can
rent pretty fast computers (20 Ghz, 8BG of RAM). You may want to  
try

out the Windows version (little bit more expensive) which is easily
accessible via remote desktop. Installing Revolution (which is free
for academic purposes) (64 Bit Version) might give you a good start.
Maybe its not a viable option in a long term view (pricing), but it
might help to get a clue whether the problem can be solved on a  
bigger

machine and just trying it out without a bigger hastle...

Good luck!

Lui

On Sat, Jun 11, 2011 at 7:47 AM, Abhisek Saha tad...@gmail.com  
wrote:

Thanks Anupam for your inputs. I believe there are two ways to
circumvent the issue...1 making the code more efficient 1  
Increasing
the memory in some way.The reasons why I did not paste the code  
are 1

It is long and using quite a number of functions that  I created 2
Secondly my intention is not to make the code more efficient if  
that's
possible. Here landing into a memory problem with 2 GB RAM is  
natural

as my analysis entails 1500 simulations from huge multivariate
distributions that change after every simulation and tomorrow I may
have to do similar analysis with 10 million observations * 20  
columns.


In view of above I shall be needing more memory sometime later and  
my

IT friends are ready to support me for that(probably with a sandbox)
but I am not sure whether I can install probably a servor version  
of R

that can be capable of working with 8GB or so RAM. So it is more of
technical help I need and I have no clue regarding the  
plausibility of
the solution mentioned( i.e. a servor version of R that is capable  
of

more memory).

Regards,
Abhisek

On Sat, Jun 11, 2011 at 10:10 AM, Anupam anupa...@gmail.com wrote:


It will be helpful on this forum to use metric measures: 12 Lakh  
is 1.2
million, thus your data is 1.2 million observations x 15  
variables. I do not
know the intricacies of R. You may have to wait for someone with  
that

knowledge to respond.

Including some relevant portions of error messages and code in  
your query

can also help someone to respond to your message.

Anupam.
-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org 
] On

Behalf Of Abhisek Saha
Sent: Saturday, June 11, 2011 6:25 AM
To: r-help@r-project.org
Subject: [R] Memory(RAM) issues

Dear All,
I have been working with R(desktop version) in VISTA. I have the  
latest
version R 2.13.0. I have been working with few data-sets of size  
12Lakh * 15
and my code is quite computing intensive ( applying MCMC gibbs  
sampler on a
posterior of 144 variables) that often runs into a memory issue  
such as

memory can not allocate the vector ,full size(shows to have reached
something like 1.5 GB) reached or

Re: [R] else problem

2011-06-23 Thread Kara Przeczek

Thank you for all your help! I did not know to use  when searching for help, 
as ?mean, etc, had always worked for me in the past. 
It makes perfect sense why 'else' was causing me the trouble the way I was 
using it. I think it was working in my other code, despite the same format, 
because it was part of a function and thus would have been executed completely 
within the function?
Cheers,
Kara

From: Bert Gunter [gunter.ber...@gene.com]
Sent: June 23, 2011 8:27 AM
To: David Winsemius
Cc: Kara Przeczek; r-help@r-project.org
Subject: Re: [R] else problem

Perhaps some additional clarification...  (???)

 if (length(dat2f$year)%%2==0) {
  md -dat2f[, list(med_year = max(year[which(abs(tot_km3y -
 median(tot_km3y)) == min(abs(tot_km3y - median(tot_km3y ]), med_TotQ =
 median(tot_km3y))]  }
 else {

 If this line is executed at a console session it will fail because the
 interpreter does not keep a copy of the last condition. If you moved the
 closing curley-brace to just befor the 'else', you should get the behavior
 you expect (if your other code is correct):

Well, um.. not sure if this is what you meant, but what is happening
at the console is that when you type return, the interpreter checks
for a syntactically complete statement. If it finds what has been
given to it **thus far** is, it tries to execute it (if not, it gives
a continuation character and waits for more input) and, as you said,
then starts anew to interpret the next line(s) entered, forgetting
all previous. The problem above is that the if() statement up to the
close bracket, } is syntactically complete, and so the else{ that
follows makes no sense as the beginnig of a new line to  be
interpreted.

The simplest and universal solution to this is to simply enclose the
whole conditional in { }:

{if(length ...
...
else {...}
}

This forces the interpreter to wait for the last } before it will
interpret and execute.

Hoping this clarifies rather than obfuscates.

-- Bert




 Try instead:

 if (length(dat2f$year)%%2==0) {
  md -dat2f[, list(med_year = max(year[which(abs(tot_km3y -
 median(tot_km3y)) == min(abs(tot_km3y - median(tot_km3y ]), med_TotQ =
 median(tot_km3y))]
 } else {
  md -dat2f[, list(med_year = year[which(tot_km3y == median(tot_km3y))],
 med_TotQ = median(tot_km3y))]
  }


  md -dat2f[, list(med_year = year[which(tot_km3y == median(tot_km3y))],
 med_TotQ = median(tot_km3y))]
  }

 Each individual piece works perfectly on its own, but together I get the
 following error:

 if (length(dat2f$year)%%2==0) {

 +   md -dat2f[, list(med_year = max(year[which(abs(tot_km3y -
 median(tot_km3y)) == min(abs(tot_km3y - median(tot_km3y ]), med_TotQ =
 median(tot_km3y))] }

 else {

 Error: unexpected 'else' in else

  md -dat2f[, list(med_year = year[which(tot_km3y == median(tot_km3y))],
 med_TotQ = median(tot_km3y))]
  }

 Error: unexpected '}' in   }


 When I tried to look up else I got this error:

 ?else

 Error: unexpected 'else' in ?else

 Try instead:

 ?else

 --
 David

 I have used exactly the same set up with if...else in other code and it
 worked fine then. I tried to run it again, and I got the same error as
 above. What is the problem? I hope it isn't something simple and silly!

 I realize that I can use the first line:

  md - dat2f[, list(med_year = max(year[which(abs(tot_km3y -
 median(tot_km3y)) == min(abs(tot_km3y - median(tot_km3y ]), med_TotQ =
 median(tot_km3y))]

 for all data sets and it will give me the median for both odd and
 even-length data sets, but it is now about the principle; why won't the
 if...else work?

 Thank you very much for your time!

 Kara

 David Winsemius, MD
 West Hartford, CT

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




--
Men by nature long to get on to the ultimate truths, and will often
be impatient with elementary studies or fight shy of them. If it were
possible to reach the ultimate truths without the elementary studies
usually prefixed to them, these would not be preparatory studies but
superfluous diversions.

-- Maimonides (1135-1204)

Bert Gunter
Genentech Nonclinical Biostatistics

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Confidence interval from resampling

2011-06-23 Thread Stephen Ellison

Depending on how critical the problem is, you might also want to look at the 
literature on bootstrap CI's, perhaps starting from the references in boot.ci 
in the boot package. The simple quantiles are not necessarily the most 
appropriate. For example I seem to recall that BCa intervals were the intervals 
recommended for 'general use'  by Efron and Tibshirani (Introduction to the 
Bootstrap (1993) Chapman and Hall) with ABC intervals also getting an 
honourable mention. 1993 is a long time ago, though...

S Ellison



 -Original Message-
 From: r-help-boun...@r-project.org 
 [mailto:r-help-boun...@r-project.org] On Behalf Of David Winsemius
 Sent: 23 June 2011 15:25
 To: Adriana Bejarano
 Cc: r-help@r-project.org
 Subject: Re: [R] Confidence interval from resampling
 
 
 On Jun 23, 2011, at 9:44 AM, Adriana Bejarano wrote:
 
  Dear R gurus,
 
  I have the following code, but I still not know how to estimate and 
  extract confidence intervals (95%CI) from resampling.
 
 
 If you have a distribution of values, say resamp.stat, of a 
 statistic from a properly performed resampling operation you 
 can extract and display easily the 5th and 95th percentiles.
 
 CI.stat - quantile(resamp.stat, c(0.05, 0.95) ) CI.stat
 
 Note: I do not think that 100 replications would generally be 
 sufficient for final work, although its probably acceptable 
 for testing.
 
 Your code as posted initially threw a bunch of errors since 
 you did not include library(MASS), but fixing that fairly 
 obvious problem shows that you can draw a nice plot. However, 
 it remains unclear what statistic of what distribution you 
 desire to assess. Mean, median, ...  
 of what?
 
 I do not think the error that arose on my machine from the 
 wrapped text here:
 
 #draw random numbers from a weibull distribution 100 times with
   ... shape=xwei.shape, scale = xwei.scale   - error
 
 . was causing any problem, but there were a bunch of 
 warnings that ought to be investigated:
 
 Right after the loop I see ten of these:
 Warning messages:
 1: In dweibull(x, shape, scale, log) : NaNs produced
 
 --
 David
 
  Thanks!
 
  ~Adriana
 
  #data
  penta-
  c
  
 (770,729,640,486,450,410,400,340,306,283,278,260,253,242,240,229,201,1
  
 98,190,186,180,170,168,151,150,148,147,125,117,110,107,104,85,83,80,74
  ,70,66,54,46,45,43,40,38,10
  )
  x-log(penta+1)
  plot(ecdf(x), ylab=Probability, xlab=Concentration (Ln+1))
 
  x.wei-fitdistr(x,weibull)
  x.wei
  shapescale
   6.7291685   5.3769965
  (0.7807718) (0.1254696)
  xwei.shape - x.wei$estimate[[1]]
  xwei.scale -  x.wei$estimate[[2]]
 
  x.wei-fitdistr(x,weibull)
  x.wei
  xwei.shape - x.wei$estimate[[1]]
  xwei.scale -  x.wei$estimate[[2]]
  curve(pweibull(x, shape=xwei.shape, scale = 
  xwei.scale,lower.tail=TRUE, log.p=FALSE), 
 add=TRUE,col='green',lwd=3)
 
  #draw random numbers from a weibull distribution 100 times with 
  shape=xwei.shape, scale = xwei.scale draw - lapply(1:100, 
  function(.x){ out-rweibull(x, shape=xwei.shape, scale = xwei.scale)
  })
  newx- data.frame(draw)
 
  colnames(newx)-paste(x, 1:100, sep = )
  newmat-data.matrix(newx)
 
  # matrix of coefficients
  rownum=2
  colnum=100
  ResultMat-matrix(NA, ncol=colnum, nrow=rownum)
  rownum2=45
  colnum2=100
  ResultMat2-matrix(NA, ncol=colnum2, nrow=rownum2)
 
  #loop through each column in the source matrix for (i in 1:100)
 {
 sel_col-newmat[col(newmat)==i]  
  {ResultMat[,i]-coef(fitdistr(sel_col,weibull))}
  xwei.shape- ResultMat[1,i]
xwei.scale- ResultMat[2,i]
  curve(pweibull(x, shape=xwei.shape, scale=xwei.scale, 
 lower.tail=TRUE, 
  log.p = FALSE), add=TRUE, col='grey',lwd=0.5) 
  ResultMat2[,i]-pweibull(x, shape=xwei.shape, scale = 
  xwei.scale,lower.tail=TRUE, log.p=FALSE) }
 
  ## convert dataframe to numeric
  MatOut- as.matrix(ResultMat2)
  mode(MatOut) - numeric
 
  # initiate variables
  mm-ml-mu-rep(0,length(MatOut[,1]))
 
  # mean and upper/lower quantiles
  for(i in 1:length(MatOut[,1])){
  mm[i]- mean(MatOut[i,])
  ml[i]- quantile(MatOut[i,], 0.025, na.rm=TRUE)
  mu[i]- quantile(MatOut[i,], 0.975, na.rm=TRUE) } #lines(x, mm, 
  col=black) lines(x, ml, col=blue, lwd=2) lines(x, mu, 
 col=blue, 
  lwd=2)
 
  [[alternative HTML version deleted]]
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide 
  http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 
 David Winsemius, MD
 West Hartford, CT
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 ***
This email and any

[R] split dataframe by sample()

2011-06-23 Thread Paul Tanger

Hi,

I seemingly have a simple problem, but I've spend hours reading guides 
posts on this forum and I can't seem to piece together what I need.
I have a dataframe where I want to divide it into two subsets: a sample, and
the remainder of the dataframe in a new frame.
I've tried this:

split(df, sample(nrow(df), size=100, replace=FALSE))

another way would be to make a new dataframe of my sample and (something I
can do in SQL but not R) then select rows that are NOT in the sample
dataframe.

Thanks for any help!

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] else problem



On Jun 23, 2011, at 12:05 PM, Kara Przeczek wrote:

Thank you for all your help! I did not know to use  when searching  
for help, as ?mean, etc, had always worked for me in the past.
It makes perfect sense why 'else' was causing me the trouble the way  
I was using it. I think it was working in my other code, despite the  
same format, because it was part of a function and thus would have  
been executed completely within the function?


Right. When done at the console or sourced, the problem will arise,  
but not when inside a function. Bert's strategy of enclosing the whole  
call in {} can be used to demonstrate at the console:


 {fntest - function() if (FALSE)  print(T)
+else  print(F) }
 fntest()
[1] F

Or you can enclose just the body:

 fntest - function() { if (FALSE)   print(T)
+  else { print(F) } }
 fntest()
[1] F
--
David



Cheers,
Kara

From: Bert Gunter [gunter.ber...@gene.com]
Sent: June 23, 2011 8:27 AM
To: David Winsemius
Cc: Kara Przeczek; r-help@r-project.org
Subject: Re: [R] else problem

Perhaps some additional clarification...  (???)


if (length(dat2f$year)%%2==0) {
md -dat2f[, list(med_year = max(year[which(abs(tot_km3y -
median(tot_km3y)) == min(abs(tot_km3y - median(tot_km3y ]),  
med_TotQ =

median(tot_km3y))]  }
else {


If this line is executed at a console session it will fail because  
the
interpreter does not keep a copy of the last condition. If you  
moved the
closing curley-brace to just befor the 'else', you should get the  
behavior

you expect (if your other code is correct):


Well, um.. not sure if this is what you meant, but what is happening
at the console is that when you type return, the interpreter checks
for a syntactically complete statement. If it finds what has been
given to it **thus far** is, it tries to execute it (if not, it gives
a continuation character and waits for more input) and, as you said,
then starts anew to interpret the next line(s) entered, forgetting
all previous. The problem above is that the if() statement up to the
close bracket, } is syntactically complete, and so the else{ that
follows makes no sense as the beginnig of a new line to  be
interpreted.

The simplest and universal solution to this is to simply enclose the
whole conditional in { }:

{if(length ...
...
else {...}
}

This forces the interpreter to wait for the last } before it will
interpret and execute.

Hoping this clarifies rather than obfuscates.

-- Bert





Try instead:

if (length(dat2f$year)%%2==0) {
md -dat2f[, list(med_year = max(year[which(abs(tot_km3y -
median(tot_km3y)) == min(abs(tot_km3y - median(tot_km3y ]),  
med_TotQ =

median(tot_km3y))]
} else {
md -dat2f[, list(med_year = year[which(tot_km3y ==  
median(tot_km3y))],

med_TotQ = median(tot_km3y))]
}


md -dat2f[, list(med_year = year[which(tot_km3y ==  
median(tot_km3y))],

med_TotQ = median(tot_km3y))]
}

Each individual piece works perfectly on its own, but together I  
get the

following error:


if (length(dat2f$year)%%2==0) {


+   md -dat2f[, list(med_year = max(year[which(abs(tot_km3y -
median(tot_km3y)) == min(abs(tot_km3y - median(tot_km3y ]),  
med_TotQ =

median(tot_km3y))] }


else {


Error: unexpected 'else' in else


md -dat2f[, list(med_year = year[which(tot_km3y ==  
median(tot_km3y))],

med_TotQ = median(tot_km3y))]
}


Error: unexpected '}' in   }




When I tried to look up else I got this error:


?else


Error: unexpected 'else' in ?else


Try instead:

?else

--
David


I have used exactly the same set up with if...else in other code  
and it
worked fine then. I tried to run it again, and I got the same  
error as
above. What is the problem? I hope it isn't something simple and  
silly!


I realize that I can use the first line:

md - dat2f[, list(med_year = max(year[which(abs(tot_km3y -
median(tot_km3y)) == min(abs(tot_km3y - median(tot_km3y ]),  
med_TotQ =

median(tot_km3y))]

for all data sets and it will give me the median for both odd and
even-length data sets, but it is now about the principle; why  
won't the

if...else work?

Thank you very much for your time!

Kara


David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.





--
Men by nature long to get on to the ultimate truths, and will often
be impatient with elementary studies or fight shy of them. If it were
possible to reach the ultimate truths without the elementary studies
usually prefixed to them, these would not be preparatory studies but
superfluous diversions.

-- Maimonides (1135-1204)

Bert Gunter
Genentech Nonclinical Biostatistics


David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list

Re: [R] split dataframe by sample()

2011-06-23 Thread Sarah Goslee

It's very simple to do this in steps:

 # to make separate dataframes
 df - data.frame(A=1:5, B=11:15)
 df.sample - c(1,3,4)
 df[df.sample, ]
  A  B
1 1 11
3 3 13
4 4 14
 df[-df.sample, ]
  A  B
2 2 12
5 5 15

 # or a list with two components
 split(df, 1:nrow(df) %in% df.sample)
$`FALSE`
  A  B
2 2 12
5 5 15

$`TRUE`
  A  B
1 1 11
3 3 13
4 4 14

Sarah

On Thu, Jun 23, 2011 at 12:12 PM, Paul Tanger paul.tan...@colostate.edu wrote:
 Hi,

 I seemingly have a simple problem, but I've spend hours reading guides 
 posts on this forum and I can't seem to piece together what I need.
 I have a dataframe where I want to divide it into two subsets: a sample, and
 the remainder of the dataframe in a new frame.
 I've tried this:

 split(df, sample(nrow(df), size=100, replace=FALSE))

 another way would be to make a new dataframe of my sample and (something I
 can do in SQL but not R) then select rows that are NOT in the sample
 dataframe.

 Thanks for any help!

        [[alternative HTML version deleted]]



-- 
Sarah Goslee
http://www.functionaldiversity.org

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Saved EPS does not match screen when using bquote(.(i))

2011-06-23 Thread Peter Ehlers



I think that there may be a problem with the way
bquote(), for() and savePlot() play together in
the OP's example (multiple plots on a windows device;
bquote using the loop index).

Here's a version using replayPlot():

## show mu with subscripts 4 and 9:
  x11()
  par(mfrow = c(2,1))
  for (i in c(4, 9)) {
plot(0, 0, main = bquote(mu[.(i)]))
  }

## now record and replay:
  z - recordPlot()
  replayPlot(z)
## both subscripts are now 9


Simple workarounds are:

1. As Uwe and Dennis show, for saving to file,
use the appropriate device.

2. For recalling plots (e.g. with recording turned on),
a) use substitute() instead of bquote() or
b) insert something like
   i - i
before the plot() call.

sessionInfo()
R version 2.13.0 Patched (2011-05-24 r55981)
Platform: i386-pc-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=English_Canada.1252  LC_CTYPE=English_Canada.1252
[3] LC_MONETARY=English_Canada.1252 LC_NUMERIC=C
[5] LC_TIME=English_Canada.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

Peter Ehlers


On 2011-06-22 22:14, Dennis Murphy wrote:

Hi:

As Uwe suggested...

pdf('testgraph.pdf')
layout( matrix( 1:2 , nrow=2 ) )
for ( i in 1:2 ) {
   plot( 0 , 0 , xlab=bquote(mu[.(i)]) )
  }
dev.off()

postscript('testgraph.ps')
layout( matrix( 1:2 , nrow=2 ) )
for ( i in 1:2 ) {
   plot( 0 , 0 , xlab=bquote(mu[.(i)]) )
  }
dev.off()

png('testgraph.png')
layout( matrix( 1:2 , nrow=2 ) )
for ( i in 1:2 ) {
   plot( 0 , 0 , xlab=bquote(mu[.(i)]) )
  }
dev.off()

The three graphs look the same (although the PS graph is rotated to
landscape while the other two are portrait). The main point is that
mu_1 and mu_2 show up correctly in the two panels in all three graphs
(at least on my viewers).

The following thread from last January describes some of the problems
that certain viewers have with Greek letters, which appear to be
viewer and platform dependent:

http://r-project.markmail.org/search/?q=pdf%20incorrect#query:pdf%20incorrect+page:2+mid:egmb6utulrxgcznw+state:results

I'm guessing that I've seen about a half dozen or so similar posts in
this forum over the past year and a half, so you can check the list
archives for related problems.

HTH,
Dennis

On Wed, Jun 22, 2011 at 8:07 PM, John Kruschkejohnkrusc...@gmail.com  wrote:

Here's a fairly minimal-case example in which the saved EPS does not match
the screen. The error comes when using bquote(.(i)) instead of bquote(1), as
demonstrated by the two minimally different cases below. Very strange. Any
clues as to why?

# begin ---

# Version A. X axis labels have subscripts as constants. EPS is correct.
windows()
layout( matrix( 1:2 , nrow=2 ) )
plot( 0 , 0 , xlab=bquote(mu[1]) )
plot( 0 , 0 , xlab=bquote(mu[2]) )
savePlot( file=SavePlotTestA.eps , type=eps ) # Axis labels are correct
in EPS.

# Version B. X axis labels have subscripts as variable index. EPS is wrong!
windows()
layout( matrix( 1:2 , nrow=2 ) )
for ( i in 1:2 ) {
  plot( 0 , 0 , xlab=bquote(mu[.(i)]) )
}
savePlot( file=SavePlotTestB.eps , type=eps ) # X-AXIS OF PLOT 1 IS
WRONG IN EPS.

#-- end -

Thanks!

John K. Kruschke, Professor
http://www.indiana.edu/%7Ekruschke/DoingBayesianDataAnalysis/


2011/6/22 Uwe Liggeslig...@statistik.tu-dortmund.de




On 22.06.2011 13:50, John Kruschke wrote:


The error happens when using the savePlot() command, like this:
savePlot( file=TestSavePlot.eps , type=eps )
savePlot( file=TestSavePlot.jpg , type=jpg )



Well, plot directly into a device, for postscript:

postscript(estSavePlot.eps, additionalArguments .)
plot(1:10)
dev.off()

Uwe Ligges


  The images in the two saved files are not the same, with the JPG being

correct but the EPS being wrong.

When you suggest starting separate devices explicitly, what do you mean?
(I've skimmed through the results of ??device, but can't make sense of
it.)
Thank you!

John K. Kruschke, Professor


2011/6/22 Uwe 
Liggeslig...@statistik.tu-**dortmund.delig...@statistik.tu-dortmund.de




  I guess you use the menu to save the plots from your Windows device into

files rather than starting separate devices explicitly?
If so, please use explicit calls to the devices and everything happens as
you define it.

Uwe Ligges




On 22.06.2011 04:31, John Kruschke wrote:

  When I save a particular plot as JPG or BMP, the saved image is an

accurate
copy of that plot.
But when I save the same plot as EPS or PDF, some text elements are
incorrectly altered.
An example is attached. Notice in the top middle and top right panels,
the
x-axis labels have correct subscript 1 in the JPG, but incorrect
subscript
2
in the EPS.

I'm using R 2.13.0 on Windows 7.

Any clues regarding the source of this error and its solution would be
appreciated. For example, are there EPS/PDF device drivers that need to
be
separately updated?

Many thanks.

John K. Kruschke, Professor

[R] Lattice xyplot to group by two parameters

2011-06-23 Thread Guy Jett

My thanks to this mailing list and its members for their great help in the 
past.  I have yet another question per the following code and comments:

# I need individual graphs grouped by PARLABEL AND Event, with PARLABEL
#   controlling pct and lty, and Event controlling col (where Event==1 is 
green
#   and Event==2 is red). This is attempted in the xyplot parameter
#   groups = c(PARLABEL, Event). Current behavior appears to be that the
#   colors are controlled by PARLABEL along with the shape and line type.  
Some 
#   experimentation with those two parameters (reverse order, one or the other 
only)
#   suggests that the second parameter is ignored.
#
# I have also experimented with panel.superpose, subsetting, and Extended 
formula
#   interface, as appears to be suggested by the man pages, but to no avail. 
#   
#   Note that I can get the graphics layout I need with ggplot2, but that has 
its
#   own issues with secondary axis and subtitle text placement, formatting, and 
#   pagination.
#
# Example script below.  Datafile included. 

  depthlimit=c(7,-1)
  xlimit=c(0.1,125000)
  TransectOrder = c(LLR19, LLR18, LLR17)

  OffsetOrder  =  c(T, U, V, Y, Z,
A, B, C, D, E, F, G, H)

# Parse the data for these plots -
  df - read.csv(file = 5.04-r02_LTC-SE-SO-Compared.csv)
  df$Transect - factor(df$Transect, levels = TransectOrder)
  levels(df$Transect)
  df$Offset - factor(df$Offset, levels = OffsetOrder)
  levels(df$Offset)

xyplot((sbd+sed)/2 ~ Result | Offset+Transect, groups = c(PARLABEL, Event),
  as.table = TRUE,
  drop.unused.levels = FALSE, data = df,
  layout = c(13,3), type = b,
  xlim = c(0.01, 122000), ylim = depthlimit,
  pch  =c(16, 18), cex = c(1, 1), # to be controlled by PARLABEL
  lty  =c(dashed, solid), # to be controlled by PARLABEL
  col = c(green, red),# to be controlled by Event
  scales = list(alternating = 1, x = list(rot = 45, log = TRUE)))

Yours,
Guy Jett, R.G.
Project Geologist
ITSI,  A Gilbane Company
gj...@itsi.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Glmnet Variable Questions

2011-06-23 Thread Paul Bleicher

Hi all, 

I have two questions about variables in glmnet:
 
1. We are doing a logistic regression with binary outcome variable using a
set of predictors that include continuous and binary predictors(coded 0 and
1).  If the latter are centered and standardized, they will be transformed
into negative and positive numbers; when multiplied by a single beta, I
believe they will have undesirable effects.  Is there a way to standardize
only specified variables?  Alternatively, should glmnet be run with manually
centered and standardized continuous variables, binary variables coded 0 and
1, and with standardize = FALSE.
 
2. We have predictors with missing values.  We have been handling these by
creating a dummy variable for the predictor with a value of 0 if a value is
present and 1 if a value is absent.  If the model is forced to include both
the predictor and the dummy variable, the model-assigned coefficient will
effectively interpolate for the missing value.  How can I force the dummy
variable to be included in glmnet whenever the predictor variable is
included?

--
View this message in context: 
http://r.789695.n4.nabble.com/Glmnet-Variable-Questions-tp3620379p3620379.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Finding the levels in a data frame column

2011-06-23 Thread Greg Snow

Try levels(df$city) or unique(df$city) depending on if it is a factor (default) 
or character string.

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of Abraham Mathew
 Sent: Thursday, June 23, 2011 10:01 AM
 To: r-help@r-project.org
 Subject: [R] Finding the levels in a data frame column
 
 I have a data frame that looks as follows.
 
 df - data.frame(city=c(Houston, Houston, El Paso, Waco,
 Houston,
 Plano, Plano)
 
 What I want to do is get a list of the city values. Currently, when I
 run
 df$city, I get all the values.
 
 I just want to know the four cities that appear.
 So instead of:  Houston, Houston, El Paso, Waco, Houston,
 Plano,
 Plano
 
 I want: Houston, El Paso, Waco, Plano
 
 
 I'm running R 2.13 on Ubuntu 10.10
 
 
 Abraham
 
   [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Finding the levels in a data frame column

2011-06-23 Thread Abraham Mathew

I have a data frame that looks as follows.

df - data.frame(city=c(Houston, Houston, El Paso, Waco, Houston,
Plano, Plano)

What I want to do is get a list of the city values. Currently, when I run
df$city, I get all the values.

I just want to know the four cities that appear.
So instead of:  Houston, Houston, El Paso, Waco, Houston, Plano,
Plano

I want: Houston, El Paso, Waco, Plano


I'm running R 2.13 on Ubuntu 10.10


Abraham

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Merging multiple data sets

2011-06-23 Thread Sarah Goslee

Hi,

 How can I best do this in R? I've looked into merge but it excludes ids that
 aren't in all 3 data sets.
You need to look a bit harder at merge(), specifically the all.x and
all.y options.

Sarah

On Thu, Jun 23, 2011 at 12:53 PM, cddesjar cddesjard...@gmail.com wrote:
 Hi,
 I am trying to merge data similar to the example data below

 dat0

 id    var1    var2    var3
 2       1         0         1
 3       1         0         1
 4       0         1         1
 5       0         1         1

 dat1

 id    var4    var5    var6
 2       1         0         1
 3       1         0         1
 6       0         1         1
 7       0         1         1

 dat2

 id    var7    var8    var9
 2       1         0         1
 5       1         0         1
 6       0         1         1
 8       0         1         1

 Basically what I'd like to do is combine these variables on id and create
 one large data frame that looks like the following.

 dat3

 id    var1    var2    var3   var4    var5   var6   var7   var8   var9
 2       1         0         1      1         0        1      1        0
 1
 3       1         0         1      1         0        1      NA     NA
 NA
 4       0         1         1      NA      NA      NA    NA     NA     NA
 5       0         1         1      NA      NA      NA    NA     NA     NA
 6      NA       NA       NA     0         1        1      0       1        1
 7      NA       NA       NA     0         1        1     NA     NA     NA
 8      NA       NA       NA     NA      NA     NA     0        1        1

 How can I best do this in R? I've looked into merge but it excludes ids that
 aren't in all 3 data sets.


-- 
Sarah Goslee
http://www.functionaldiversity.org
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Merging multiple data sets

2011-06-23 Thread Dennis Murphy

Hi:

Try this:

merge(merge(dat0, dat1, by = 'id', all.x = TRUE, all.y = TRUE), dat2,
by = 'id', all.x = TRUE, all.y = TRUE)

Dennis

On Thu, Jun 23, 2011 at 9:53 AM, cddesjar cddesjard...@gmail.com wrote:
 Hi,
 I am trying to merge data similar to the example data below

 dat0

 id    var1    var2    var3
 2       1         0         1
 3       1         0         1
 4       0         1         1
 5       0         1         1

 dat1

 id    var4    var5    var6
 2       1         0         1
 3       1         0         1
 6       0         1         1
 7       0         1         1

 dat2

 id    var7    var8    var9
 2       1         0         1
 5       1         0         1
 6       0         1         1
 8       0         1         1

 Basically what I'd like to do is combine these variables on id and create
 one large data frame that looks like the following.

 dat3

 id    var1    var2    var3   var4    var5   var6   var7   var8   var9
 2       1         0         1      1         0        1      1        0
 1
 3       1         0         1      1         0        1      NA     NA
 NA
 4       0         1         1      NA      NA      NA    NA     NA     NA
 5       0         1         1      NA      NA      NA    NA     NA     NA
 6      NA       NA       NA     0         1        1      0       1        1
 7      NA       NA       NA     0         1        1     NA     NA     NA
 8      NA       NA       NA     NA      NA     NA     0        1        1

 How can I best do this in R? I've looked into merge but it excludes ids that
 aren't in all 3 data sets.


 --
 View this message in context: 
 http://r.789695.n4.nabble.com/Merging-multiple-data-sets-tp3620431p3620431.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Ranking submodels by AIC (more general question)

Here's a more general question following up on the specific question I
asked earlier:

Can anybody recommend an R command other than mle.aic() (from the wle
package) that will give back a ranked list of submodels?  It seems like
a pretty basic piece of functionality, but the closest I've been able to
find is stepAIC(), which as far as I can tell only gives back the best
submodel, not a ranking of all submodels.

Thanks in advance,
Alexandra

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] two panel figure using barplot2

2011-06-23 Thread Adrienne Keller

I am trying to make a two panel figure using barplot2 in gplots. When  
I use par(mfrow=c(1,2)), two panels are created but both barplots are  
overlaid on top of one another in the first panel and the second panel  
is left blank.  Am I running into problems because I of some  
complexities of gplots? Is there a simple way to make a two panel  
figure with two barplots side by side using barplot2?


Thank you,

Adrienne Keller
Graduate Student, College of Forestry
University of Montana
adrienne.kel...@umontana.edu
(Phone) 651-485-5822

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Help using cutreeHybrid

2011-06-23 Thread Estefania Ruiz Vargas

I am using the function cutreeHybrid from the package dynamic Tree Cut and I 
need a list of the resulting clusters but I do not know how to get it.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] subset

2011-06-23 Thread yf

Dear all,

How can I do the subset fucntion from the table? I want to do the subset for
the less than 50. 
I tried b8a-subset(b8, (table(g$book)50)==TRUE) but it didn't work.
Thanks.  

table(g$ book )

119 121 134 160 161 170 175 179 190 193 225 226 256 260 
130  89  50  8774  23  8547  93  64  1669  48   38 



--
View this message in context: 
http://r.789695.n4.nabble.com/subset-tp3620557p3620557.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] help- subtitles for multiple charts

2011-06-23 Thread jalen

Hello, I have a problem with my script. I don'y know how to apply subtitles.
I have 9 charts per page (for combination of 3 blocks and 3 treatments). I
want to have subtitles for this interaction (e.g. Block A Trt 1, Block A Trt
2, ...)

MBE$bt- interaction(MBE$Block,MBE$trt)
par(mfrow=c(3,3))
for(i in unique(MBE$bt)){
ss - MBE$bt==i
plot(MBE$Year[ss], MBE$DBH[ss])
sm-loess(DBH~Year, data=pMBE[ss,])
x=seq(2004, 2010, by=1)   
points(x, predict(sm, data.frame(Year=x)), type=l)
}

It was possible for the command: 
MBE$trt.name- factor(MBE$trt, label=c(Treatment 1, Treatment 2,
Treatment 3))
xyplot(MBE$DBH~MBE$Year|MBE$trt.name*MBE$Block, xlab=Year, ylab=DBH)-
but I need it for the upper script, cuz the diagram is more appropriate.
I would be grateful for any advise!

--
View this message in context: 
http://r.789695.n4.nabble.com/help-subtitles-for-multiple-charts-tp3620453p3620453.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Merging multiple data sets

2011-06-23 Thread cddesjar

Hi,
I am trying to merge data similar to the example data below

 dat0

idvar1var2var3
2   1 0 1
3   1 0 1
4   0 1 1
5   0 1 1

 dat1

idvar4var5var6
2   1 0 1
3   1 0 1
6   0 1 1
7   0 1 1

 dat2

idvar7var8var9
2   1 0 1
5   1 0 1
6   0 1 1
8   0 1 1

Basically what I'd like to do is combine these variables on id and create
one large data frame that looks like the following.

 dat3

idvar1var2var3   var4var5   var6   var7   var8   var9  
2   1 0 1  1 01  10  
1
3   1 0 1  1 01  NA NA
NA
4   0 1 1  NA  NA  NANA NA NA
5   0 1 1  NA  NA  NANA NA NA
6  NA   NA   NA 0 11  0   11
7  NA   NA   NA 0 11 NA NA NA
8  NA   NA   NA NA  NA NA 011

How can I best do this in R? I've looked into merge but it excludes ids that
aren't in all 3 data sets.


--
View this message in context: 
http://r.789695.n4.nabble.com/Merging-multiple-data-sets-tp3620431p3620431.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] caret's Kappa for categorical resampling

2011-06-23 Thread Harlan Harris

Yes, that's true. On a test set, the highest probability of being in the
smaller class is about 40%. (Incidentally, accuracy on the test set is much
higher when I use the best-according-to-Kappa model instead of the
best-according-to-Accuracy model.)

It looks like the ctree() method supports weights, but all it does is
multiply the class likelihoods, which isn't what I want. (That is, if I
assign a weight of 2 to all of the small-class instances, it generates the
same model, but says that the likelihood for the most-confident instances is
about 80% instead of 40%!)

I'm still not really understanding why Kappa is not acting like a positive
monotonic function of Accuracy, though.

Thanks!


On Wed, Jun 22, 2011 at 8:12 PM, kuhnA03 max.k...@pfizer.com wrote:

  Harlan,

 It looks like your model is predicting (almost) everything to be the
 majority class (accuracy is almost the same as the largest class
 percentage). Try setting a test set aside and use confusionMatrix to look at
 how the model is predicting in more detail. You can try other models that
 will let you weight the minority class higher to get a more balanced
 prediction.

 Max



 On 6/22/11 3:37 PM, Harlan Harris har...@harris.name wrote:

 Hello,

 When evaluating different learning methods for a categorization problem
 with the (really useful!) caret package, I'm getting confusing results from
 the Kappa computation. The data is about 20,000 rows and a few dozen
 columns, and the categories are quite asymmetrical, 4.1% in one category and
 95.9% in the other. When I train a ctree model as:

 model - train(dat.dts,
  dat.dts.class,
  method='ctree',
  tuneLength=8,
  trControl=trainControl(number = 5, workers=1),
  metric='Kappa')

 I get the following puzzling numbers:



   mincriterion  Accuracy  Kappa   Accuracy SD  Kappa SD
   0.01  0.961 0.0609  0.00151  0.0264
   0.15  0.962 0.049   0.00116  0.0248
   0.29  0.963 0.0405  0.00227  0.035
   0.43  0.964 0.0349  0.00257  0.0247
   0.57  0.964 0.0382  0.0022   0.0199
   0.71  0.964 0.0354  0.00255  0.0257
   0.85  0.964 0.036   0.00224  0.024
   0.99  0.965 0.0091  0.00173  0.0203

 (mincriterion determines the likelihood of accepting a split into the
 tree.) The Accuracy numbers look sorta reasonable, if not great; the model
 overfits and barely beats the base rate if it builds a complicated tree. But
 the Kappa numbers go the opposite direction, and here's where I'm not sure
 what's going on. The examples in the vingette show Accuracy and Kappa being
 positively correlated. I thought Kappa was just (Accuracy - baserate)/(1 -
 baserate), but the reported Kappa is definitely not that.

 Suggestions? Aside from looking for a better model, which would be good
 advice here, what metric would you recommend? Thank you!

  -Harlan




[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] help- subtitles for multiple charts



On Jun 23, 2011, at 1:00 PM, jalen wrote:

Hello, I have a problem with my script. I don'y know how to apply  
subtitles.
I have 9 charts per page (for combination of 3 blocks and 3  
treatments). I
want to have subtitles for this interaction (e.g. Block A Trt 1,  
Block A Trt

2, ...)

MBE$bt- interaction(MBE$Block,MBE$trt)
par(mfrow=c(3,3))
for(i in unique(MBE$bt)){
ss - MBE$bt==i
plot(MBE$Year[ss], MBE$DBH[ss])


If this is working for you properly except for the subtitles ...  
Perhaps:


 plot(MBE$Year[ss], MBE$DBH[ss] sub=levels(MBE$bt)[i])

But I'm not sure you have properly separated the values of Year and  
DBH into categories of 'bt'. Perhaps that logical ss vector will work.  
Cannot tell in absence of data.



sm-loess(DBH~Year, data=pMBE[ss,])
x=seq(2004, 2010, by=1)
points(x, predict(sm, data.frame(Year=x)), type=l)
}

It was possible for the command:
MBE$trt.name- factor(MBE$trt, label=c(Treatment 1, Treatment 2,
Treatment 3))
xyplot(MBE$DBH~MBE$Year|MBE$trt.name*MBE$Block, xlab=Year,  
ylab=DBH)-
but I need it for the upper script, cuz the diagram is more  
appropriate.

I would be grateful for any advise!



My advice: Post reproducible example.

--
David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] two panel figure using barplot2

2011-06-23 Thread Marc Schwartz

On Jun 23, 2011, at 1:19 PM, Adrienne Keller wrote:

 I am trying to make a two panel figure using barplot2 in gplots. When I use 
 par(mfrow=c(1,2)), two panels are created but both barplots are overlaid on 
 top of one another in the first panel and the second panel is left blank.  Am 
 I running into problems because I of some complexities of gplots? Is there a 
 simple way to make a two panel figure with two barplots side by side using 
 barplot2?
 
 Thank you,
 
 Adrienne Keller

Hi,

barplot2() from the gplots CRAN package is built upon the standard barplot() R 
function. I can run the following code here just fine:


  require(gplots)

  par(mfrow = c(1, 2))

  barplot2(1:5)

  barplot2(6:10)


and end up with side by side plots.  If you post a minimal example of the code 
that you are using that reproduces the error, we can help you better in trying 
to determine what may be happening on your end.

Regards,

Marc Schwartz

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] subset



On Jun 23, 2011, at 1:47 PM, yf wrote:


Dear all,

How can I do the subset fucntion from the table?


What table?


I want to do the subset for
the less than 50.
I tried b8a-subset(b8, (table(g$book)50)==TRUE) but it didn't work.


And you neither shown us what these structures are nor posted an error  
message.



Thanks.

table(g$ book )



The subset function is for dataframes. Tables are essentially  
matrices. Use logical indexing with [.




119 121 134 160 161 170 175 179 190 193 225 226 256 260
130  89  50  8774  23  8547  93  64  1669  48   38


The top row are names and the bottom row are values

Perhaps (but how b8 enters into this remains unclear) :

tbl - table(g$ book )

tbl[ tbl  50]




--
View this message in context: 
http://r.789695.n4.nabble.com/subset-tp3620557p3620557.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] concordance

2011-06-23 Thread Assieh Rashidi

Is there any package for computing concordance coefficient of incomplete 
ranking?
If there is not, please help me to write it.


 
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Ranking submodels by AIC (more general question)

2011-06-23 Thread Jan van der Laan


Alexandra,

Have a look at add1 and drop1.

Regards,
Jan


On 06/23/2011 07:32 PM, Alexandra Thorn wrote:

Here's a more general question following up on the specific question I
asked earlier:

Can anybody recommend an R command other than mle.aic() (from the wle
package) that will give back a ranked list of submodels?  It seems like
a pretty basic piece of functionality, but the closest I've been able to
find is stepAIC(), which as far as I can tell only gives back the best
submodel, not a ranking of all submodels.

Thanks in advance,
Alexandra

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Help using cutreeHybrid

2011-06-23 Thread Peter Langfelder

On Thu, Jun 23, 2011 at 11:19 AM, Estefania Ruiz Vargas eruiz...@uwo.ca wrote:
 I am using the function cutreeHybrid from the package dynamic Tree Cut and I 
 need a list of the resulting clusters but I do not know how to get it.

Hi,

I'm the author of the package. The function returns a list whose
component 'labels' contains the cluster assignment (label) of all
clustered objects. So you can call the function, say as

cth = cutreeHybrid(myTree, )

and the labels are

labels = cth$labels

You can also use the function cutreeDynamic which (with default
arguments) is a wrapper for cutreeHybrid and returns just the cluster
labels.

HTH,

Peter



        [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Sent from my Linux computer. Way better than iPad :)

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Ranking submodels by AIC (more general question)

Thanks for the suggestion.  Those functions only provide part of the
functionality I want.  

After a great deal more of drawing the internet, I've concluded that the
correct answer to my question is dredge() from the package MuMIn.  It
seems to use the same AIC algorithm as AIC, which is perfect for making
comparisons!

Thanks again to everybody who's tried to help me out on this!
Alexandra

On Thu, 2011-06-23 at 21:29 +0200, Jan van der Laan wrote: 
 Alexandra,
 
 Have a look at add1 and drop1.
 
 Regards,
 Jan
 
 
 On 06/23/2011 07:32 PM, Alexandra Thorn wrote:
  Here's a more general question following up on the specific question I
  asked earlier:
 
  Can anybody recommend an R command other than mle.aic() (from the wle
  package) that will give back a ranked list of submodels?  It seems like
  a pretty basic piece of functionality, but the closest I've been able to
  find is stepAIC(), which as far as I can tell only gives back the best
  submodel, not a ranking of all submodels.
 
  Thanks in advance,
  Alexandra
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Loops, Paste, Apply? What is the best way to set up a list of many equations?

2011-06-23 Thread Rita Carreira



Is there a way to apply paste to list(form1 = EQ1, form2 = EQ2, form3 = EQ3, 
form4 = EQ4) such that I don't have to write form1=EQ1 for all my models (I 
might have a list of 20 or more)? I also need the EQs to read the formulas 
associated with them.

For example, below, I was able to automate the name assignment but I could not 
figure out how to to set up the list using paste or other functions:
V2system-c(formula(paste(form1,=,EQ1)), 
formula(paste(form2,=,EQ2)),                                 
formula(paste(form3,=,EQ3)), 
formula(paste(form4,=,EQ4)))names(V2system)-paste(form,1:4,sep=)

Any ideas?Thank you so much in advance for any help provided.
Rita

 HERE'S MY SAMPLE PROGRAM 
YX-as.data.frame(matrix(rnorm(280),ncol=14, nrow=20))  ## generate 
variablesnames(YX) -c(paste(Y, 1:4, sep=),                  ## assign Y 
variables' names              paste(X, 1:10, sep=))                 ## 
assign X variables' names
EQ1 - Y1 ~ X1 + X2 + X4 + X7 + X10      ## equation 1 formulaEQ2 - Y2 ~ X2 + 
X3 + X5 + X8 + X10      ## equation 2 formulaEQ3 - Y3 ~ X5 + X6 + X7 + X9      
      ## equation 3 formulaEQ4 - Y4 ~ X1 + X3 + X4 + X6 + X9       ## equation 
4 formula
## VERSION 1 FOR SETTING UP THE LIST:V1system -list(form1 = EQ1, form2 = EQ2, 
form3 = EQ3, form4 = EQ4)V1system
## VERSION 2 FOR SETTING UP THE 
LIST:V2system-c(formula(paste(form1,=,EQ1)), 
formula(paste(form2,=,EQ2)),             
formula(paste(form3,=,EQ3)), 
formula(paste(form4,=,EQ4)))names(V2system)-paste(form,1:4,sep=)V2system


 HERE ARE MY RESULTS  
## RESULTS OF VERSIONS 1 AND 2 ARE EXACTLY THE SAME:
# $form1# Y1 ~ X1 + X2 + X4 + X7 + X10
# $form2# Y2 ~ X2 + X3 + X5 + X8 + X10
# $form3# Y3 ~ X5 + X6 + X7 + X9
# $form4# Y4 ~ X1 + X3 + X4 + X6 + X9
## STRUCTURE OF THE LISTS IS SAME FOR BOTH VERSIONS AS WELL:
# structure(list(form1 = quote(Y1 ~ X1 + X2 + X4 + X7 + X10), #               
form2 = quote(Y2 ~ X2 + X3 + X5 + X8 + X10), #               form3 = quote(Y3 ~ 
X5 + X6 + X7 + X9), #               form4 = quote(Y4 ~ X1 + X3 + X4 + X6 + 
X9)), #          .Names = c(form1, form2, form3, form4))

Rita
=
If you think education is expensive, try ignorance.--Derek Bok
  
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Rms package - problems with fit.mult.impute

2011-06-23 Thread Frank Harrell

There is a problem passing x in the ... arguments to fit.mult.impute when the
function has a formal argument starting with x (xtrans).  To get around
this specify xtrans=a to fit.mult.impute instead of just listing a.
Frank


Lina Hellström wrote:
 
 Hi!
 Does anyone know how to do the test for goodness of fit of a logistic
 model (in rms package) after running fit.mult.impute?
 
 I am using the rms and Hmisc packages to do a multiple imputation followed
 by a logistic regression model using lrm.
 Everything works fine until I try to run the test for goodness of fit:
 residuals(type=c(gof))
 One needs to specify y=T and x=T in the fit. But I get a warning message
 when I do that with fit.multiple.impute.
 
 a-aregImpute(~med.hist.err+
 med.discr+newLiving+No.drugs+Days.categ+Los+Age+Ward+Sex, n.impute=20,
 nk=0,data=med.err)
 ddist-datadist(Age,No.drugs,Days.categ, Sex, Living, Ward)
 options(datadist=ddist)
 
 fmi-fit.mult.impute(med.hist.err~Age+No.drugs+Days.categ+Sex+Living+Ward,
 fitter=lrm, x=T, y=T,a,data=med.err)
 Error in 1:n.impute : NA/NaN argument
 In addition: Warning message:
 In 1:n.impute : numerical expression has 18 elements: only the first used
 
 It works to do the fit.mult.impute without x and y=T but then I get the
 following warning message when running residuals
 gof-residuals(fmi, type=c(gof))
 Error in residuals.lrm(fmi, type = c(gof)) :
   you did not specify y=T in the fit
 
 It was no problem to do the goodness of fit test when I ran the lrm on my
 complete data set without multiple imputation and fit.mult.impute.
 model.lrm-lrm(med.hist.err~Age+No.drugs+Days +Sex+Living+Ward, x=TRUE,
 y=TRUE)
 gof-residuals(model.lrm, type=c(gof))
 
 Thanks
 Lina
 _
 PhD student
 Linnaeus University
 Sweden
 
 
 
 
   [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 


-
Frank Harrell
Department of Biostatistics, Vanderbilt University
--
View this message in context: 
http://r.789695.n4.nabble.com/Rms-package-problems-with-fit-mult-impute-tp3619814p3621045.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] reading the results of a within subject test

2011-06-23 Thread Ann

Hi, I ran the following in R (on item means):

aov(VAR ~(a*b)+Error(item/(a*b)), data = item)

I got this result:

Error: item
  Df Sum SqMean Sq F value   Pr(F)   
a   1 7.7249e+13 7.7249e+13  11.329 0.003934 **
Residuals 16 1.0910e+14 6.8187e+12
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Error: item:a
  Df Sum SqMean Sq F value   Pr(F)   
a   1 1.1218e+14 1.1218e+14  8.6909 0.009449 **
b 1 2.7264e+10 2.7264e+10  0.0021 0.963912   
Residuals 16 2.0653e+14 1.2908e+13
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Error: item:b
Df Sum SqMean Sq F value  Pr(F)  
b   1 6.0538e+13 6.0538e+13  2.7157 0.11886  
a:b  1 1.7894e+14 1.7894e+14  8.0272 0.01199 *
Residuals   16 3.5666e+14 2.2292e+13  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Error: item:a:b
Df Sum SqMean Sq F value Pr(F)
a:b  1 4.9726e+12 4.9726e+12   0.629 0.4461
Residuals   10 7.9061e+13 7.9061e+12 

and I'm not sure what is what in there. Is it reporting a between analysis
as well?Why does the interaction between a and b appear several times?

 When I ran the anova I did get a warning message (In aov(VAR ~ (a * b) +
Error(item/(a * b)),  :
  Error() model is singular) 

I'm confused...

Thanks!

--
View this message in context: 
http://r.789695.n4.nabble.com/reading-the-results-of-a-within-subject-test-tp3621044p3621044.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Using 'foreach' for parallel computing

2011-06-23 Thread Stacey Wood

Hi all,

I'm currently working on updating an R package to run for loops in parallel
to speed up computation time.  I'm using the 'foreach' package with a
foreach loop.  When I run my code inside the loop, I get an error message
that a number of the functions aren't recognized (even though the functions
have been defined); if I call my R package inside the loop, everything runs
properly.  As you can see, this is a problem since I can't call my own
package in the source code for the package itself!  Any advice would be
appreciated.

Thanks,
Stacey

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] gcc-4.5.2 and install.packages(glmnet)?

2011-06-23 Thread Juergen Rose

Hi,

is there any chance to install glmnet with gcc-4.5.2? For me it fails on
all systems with:

trying URL
'http://mirrors.softliste.de/cran/src/contrib/glmnet_1.7.tar.gz'
Content type 'application/x-gzip' length 522888 bytes (510 Kb)
opened URL
==
downloaded 510 Kb

* installing *source* package ‘glmnet’ ...
 This package has only been tested with gfortran.
 So some checks are needed.
 R_HOME is /usr/lib64/R
Attempting to determine R_ARCH...
R_ARCH is 
Attempting to detect how R was configured for Fortran 90
Unsupported Fortran 90 compiler or Fortran 90
compilers unavailable! Stop!
ERROR: configuration failed for package ‘glmnet’

How can I get more information, what a problem has happened?

Otherwise I have no problem to compile a small Fortran 90 program:
root@orca:/home/rose/Txt/src/Test/Fortran(57)# cat hello.f90 
program Hello
  print *,'Hello world'
end program Hello
root@orca:/home/rose/Txt/src/Test/Fortran(58)# gfortran -cpp -dM
hello.f90
root@orca:/home/rose/Txt/src/Test/Fortran(59)# ./a.out
 Hello world

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Time-series analysis with treatment effects - statistical approach

2011-06-23 Thread J.Morgenroth

Mike Marchywka wrote:

I discovered a way to do repetitive tasks that can be concisely specified
using
something called a computer.

Now that's funny :)

There were not controlled tests. It was a field experiment testing the
effects that various pavement designs have on underlying soil moisture. Two
designs incorporated a porous pavement surface course, while two others were
based on standard impervious concrete pavement...the control was just bare,
exposed soil.

As you can see from the graph, the control responds quickly to rainfall
events, but dries out quickly as well due to evaporation. The porous
pavement allows for quick infiltration of precipitation, while the
impervious pavement eventually allows infiltration of rainfall, but it's
delayed.

My objective is to be able to differentiate between the pavement treatments,
such that I can state with statistical confidence that porous pavements
affects underlying soil moisture differently than impervious pavements.

I think this is obvious just looking at it, but I wanted to be able to back
it up with stats. What I'd done previously is to average by week. But as I
mentioned, I thought that an anova table with 104 rows relating to each week
was a poor way of analyzing the data. But that being said, it effectively
allows me to check for treatment-related differences.

Thanks for the suggestions to date. Maybe the more I explain what I'm trying
to achieve, the more focussed the suggestions will be. The vaguer the
question, the broader the response, right?

Thanks again,
Justin

--
View this message in context:
http://r.789695.n4.nabble.com/Time-series-analysis-with-treatment-effects-statistical-approach-tp3615856p3621179.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] packages which call functions which run C code...

2011-06-23 Thread Steven Wolf

I am looking at the function ks.test in the stats package and trying to
figure out why it gives a different result for a p-value than does the
corresponding function in MATLAB.  I am hoping for one of two responses:

 

1.   You know about ks.tests and have a familiarity with both the R and
MATLAB algorithms and can tell me why this should be happening.

2.   You know where I can find the C code so that I can decipher it.

 

Thanks,

-Steve Wolf


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Generate the next column from previous column

2011-06-23 Thread Mkkl

Hi, I'm quite new to R and are stuck with the following problem.

Lets say I have a column consisting of a 1 and the rest zero's, called G0.
G0 - c(1,rep(0,5))

Now what I would like to do is to generate G1 from G0, and G2 from G1 etc...
Just for the simplicity, let's say I need the first entry of the column to
be increased by 5 each time.

How could I do this?
Thanks already!

Kind regards,

Michael


--
View this message in context: 
http://r.789695.n4.nabble.com/Generate-the-next-column-from-previous-column-tp3621051p3621051.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] trying to import xls or xlsx files

2011-06-23 Thread wwreith

library(xlsReadWrite)
mydata-read.xls(file path, header=TRUE)

however if I change xls to csv it works just fine. Any ideas what I'm doing
wrong? I have have also using the package gdata with the exact same error.

Below is the error that pops up.

Error in findPerl(verbose = verbose) : 
  perl executable not found. Use perl= argument to specify the correct path.
Error in file.exists(tfn) : invalid 'file' argument



--
View this message in context: 
http://r.789695.n4.nabble.com/trying-to-import-xls-or-xlsx-files-tp3620580p3620580.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] change plot area in barplot2

2011-06-23 Thread Adrienne Keller

I would like to change the plot area for a figure I made with barplot2  
so that it is rectangular (for example, 5 x 3 inches) rather than  
square, which is the default. What is the best way to do this?


Thanks,

Adrienne

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] maximizing the LDV model

2011-06-23 Thread Edward Bowora

Hi

Please find attached document to see my problem that I wish you would assist
me to solve.

I want to maximise the LDV model to get alpha estimates.

I am looking forward to hearing from you soon.

Edward
Actuarial Science student
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] help- subtitles for multiple charts

2011-06-23 Thread jalen

http://r.789695.n4.nabble.com/file/n3620982/mbeFORUM.csv mbeFORUM.csv 

I uploaded my data and one more time the script (*adjusted version*):

pMBE- MBE[MBE$left!=0,]
pMBE$bt- interaction(pMBE$Block,pMBE$trt)
par(mfrow=c(3,3), oma=c(2,0,2,0))
for(i in unique(pMBE$bt)){
ss - pMBE$bt==i
plot(pMBE$Year[ss], pMBE$DBH[ss], xlab=Year, ylab=DBH [in],
ylim=range(0:12), col=28)
sm-loess(DBH~Year, data=pMBE[ss,])
x=seq(2004, 2010, by=1)   
points(x, predict(sm, data.frame(Year=x)), type=l, col=650,
title(Diameter, adj=0.5, outer=T))
}

--
View this message in context: 
http://r.789695.n4.nabble.com/help-subtitles-for-multiple-charts-tp3620453p3620982.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Estimating choice models at the individual level

2011-06-23 Thread Chris Chapman

The bayesm package does HB models for discrete choice. In practice, a hurdle 
is likely to be translating your data (if it's fielded in Sawtooth Software) to 
the correct format for bayesm (which uses a list format instead of a matrix, 
and difference coding instead of dummy coding).

(BTW, note another R list: R-SIG-DCM, which you might consider joining if 
you're not already a member. https://stat.ethz.ch/mailman/listinfo/r-sig-dcm )

Good luck,

-- Chris

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Data Analytics Corp.
Sent: Tuesday, June 21, 2011 7:45 AM
To: r-help@r-project.org
Subject: [R] Estimating choice models at the individual level

Hi,

I have a discrete choice model to estimate for a client that I originally 
planned to estimate as an aggregate model using a clogit routine.  Now the 
client is asking for results for many segments of the respondents which would 
mean, if I stayed with my original plan, I would have to estimate a large 
number of models.  I could certainly do this, but I'm thinking that it would be 
better to estimate a Hierarchical Bayes model to get coefficients for each 
individual respondent.  This way, I could pull out the people I need for a 
particular segment and use just those coefficients.  Sawtooth's program for 
MaxDiff can do this.  
Is there any R package to do an HB estimation for a discrete choice
(logit) model?

Thanks,

Walt



Walter R. Paczkowski, Ph.D.
Data Analytics Corp.
44 Hamilton Lane
Plainsboro, NJ 08536

(V) 609-936-8999
(F) 609-936-3733
w...@dataanalyticscorp.com
www.dataanalyticscorp.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] question about Software for Data analysis book

2011-06-23 Thread Fritz Scholz

the answer to the missing mars data can be found on pages 176/177 of the
referenced book.


--
View this message in context: 
http://r.789695.n4.nabble.com/question-about-Software-for-Data-analysis-book-tp1043713p3621277.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] new to R need urgent help!

2011-06-23 Thread elisheva corn

hi all-

I am doing some research, have never used R before until today and need to
understand the following program for a project.
if some one could PLEASE help me understand this program ASAP i would
GREATLY appreciate it (any syntax/ statistic comments would be great)

PLEASE PLEASE HELP!!  THANKYOU!!!
-on a side note, it seems to me that R doesnt include the pv, and it was
calculated seperatly, is this true?


 fit=gee(foci~as.factor(time)*cond,id=exper,data=drt,family=poisson(link =
log))
Beginning Cgee S-function, @(#) geeformula.q 4.13 98/01/27
running glm to get initial regression estimate
  (Intercept) as.factor(time)24
 3.051177 -2.705675
  condHypoxia as.factor(time)24:condHypoxia
-0.402259  1.429034
 pv=2*(1-pnorm(abs(summary(fit)$coef[,5])))
 data.frame(summary(fit)$coef,pv)
   Estimate Naive.S.E.   Naive.z Robust.S.E.
Robust.z
(Intercept)3.051177 0.02221052 137.37527  0.04897055
62.306363
as.factor(time)24 -2.705675 0.10890056 -24.84537  0.19987174
-13.537057
condHypoxia   -0.402259 0.03907961 -10.29332  0.10661248
-3.773095
as.factor(time)24:condHypoxia  1.429034 0.12549576  11.38711  0.17867421
7.997988
pv
(Intercept)   0.00e+00
as.factor(time)24 0.00e+00
condHypoxia   1.612350e-04
as.factor(time)24:condHypoxia 1.332268e-15
 ftable(table(drt$cond,drt$time,predict(fit)))
 0.345501643340608 1.37227675004058 2.64891772174934
3.05117673373261


Oxia0.5  00
0  485
24 3150
00
Hypoxia 0.5  00
3460
24   0  449
00
 ## 3-th term gives the difference between the Hypoxia/Oxia at time=0.5
 ## the difference between Hypoxia/Oxia at time=24
 L=matrix(c(0,0,1,1),nrow=1)
 fit$coef[L==1]
  condHypoxia as.factor(time)24:condHypoxia
-0.402259  1.429034
 L%*%fit$coef
 [,1]
[1,] 1.026775
 wald.test(fit$robust.variance,fit$coef,L=L)
Wald test:
--

Chi-squared test:
X2 = 23.8, df = 1, P( X2) = 1.1e-06


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] trying to import xls or xlsx files

2011-06-23 Thread Gabor Grothendieck

On Thu, Jun 23, 2011 at 2:00 PM, wwreith reith_will...@bah.com wrote:
 library(xlsReadWrite)
 mydata-read.xls(file path, header=TRUE)

 however if I change xls to csv it works just fine. Any ideas what I'm doing
 wrong? I have have also using the package gdata with the exact same error.

 Below is the error that pops up.

 Error in findPerl(verbose = verbose) :
  perl executable not found. Use perl= argument to specify the correct path.
 Error in file.exists(tfn) : invalid 'file' argument


Regarding the read.xls in gdata the error message means that either
perl is not installed or else its not on your path.  If its in
~/bin/perl, say,  then  try:

library(gdata)
DF - read.xls(myfile, ...whatever..., perl = ~/bin/perl)

or if you are on Windows and its in C:\Perl\bin\perl.exe, say, then try:

library(gdata)
DF - read.xls(myfile, ...whatever..., perl = C:\\Perl\\bin\\perl.exe)

Alternately put perl on your path and then you can omit the perl= argument.

-- 
Statistics  Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] packages which call functions which run C code...

2011-06-23 Thread Peter Langfelder

On Thu, Jun 23, 2011 at 4:46 PM, Steven Wolf wolfs...@msu.edu wrote:
 I am looking at the function ks.test in the stats package and trying to
 figure out why it gives a different result for a p-value than does the
 corresponding function in MATLAB.  I am hoping for one of two responses:



 1.       You know about ks.tests and have a familiarity with both the R and
 MATLAB algorithms and can tell me why this should be happening.

 2.       You know where I can find the C code so that I can decipher it.

You can download the source R bundle (a .tar.gz or tar.bz, for example
http://cran.stat.ucla.edu/src/base/R-2/R-2.13.0.tar.gz), unpack it,
then navigate to the R source directory, go to subdirectory
src/library/stats/src and look at the file ks.c. The R function
ks.test is defined in src/library/stats/R/ks.test.R.

HTH,

Peter

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] packages which call functions which run C code...

2011-06-23 Thread Peter Ehlers


On 2011-06-23 16:46, Steven Wolf wrote:

I am looking at the function ks.test in the stats package and trying to
figure out why it gives a different result for a p-value than does the
corresponding function in MATLAB.  I am hoping for one of two responses:



1.   You know about ks.tests and have a familiarity with both the R and
MATLAB algorithms and can tell me why this should be happening.

2.   You know where I can find the C code so that I can decipher it.





Go to

 https://svn.r-project.org/R/trunk/src/library/stats/R/ks.test.R

for the code of ks.test()

and to

 https://svn.r-project.org/R/trunk/src/library/stats/src/ks.c

for the relevant C code. Or download the source code from
your favourite CRAN mirror.

Peter Ehlers



Thanks,

-Steve Wolf


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] change plot area in barplot2

2011-06-23 Thread Marc Schwartz

On Jun 23, 2011, at 4:11 PM, Adrienne Keller wrote:

 I would like to change the plot area for a figure I made with barplot2 so 
 that it is rectangular (for example, 5 x 3 inches) rather than square, which 
 is the default. What is the best way to do this?
 
 Thanks,
 
 Adrienne

Hi,

The size of the plot is going to be largely dependent upon the output device 
being used and the dimension settings for the device. If you are going to use a 
bitmapped device (eg. png()) as opposed to a vector device (eg. pdf()), then 
the physical size of the image will also be dependent upon the resolution of 
the display (in pixels per inch) upon which the file is being viewed.

See ?Devices for more information on the available output devices on your 
system. Then select the particular device you wish to use for details on how to 
adjust the dimensions.

Regards,

Marc Schwartz

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] new to R need urgent help!

2011-06-23 Thread Abhijit Dasgupta

On Jun 23, 2011, at 4:42 PM, elisheva corn elishevac...@gmail.com wrote:

 hi all-
 
 I am doing some research, have never used R before until today and need to
 understand the following program for a project.
 if some one could PLEASE help me understand this program ASAP i would
 GREATLY appreciate it (any syntax/ statistic comments would be great)
 
 PLEASE PLEASE HELP!!  THANKYOU!!!
 -on a side note, it seems to me that R doesnt include the pv, and it was
 calculated seperatly, is this true?
 
 
 fit=gee(foci~as.factor(time)*cond,id=exper,data=drt,family=poisson(link =
 log))
You apparently have count data (foci) which is measured repeatedly within 
exper, and you're interested in how foci changes with time and condition 
including their interaction. The code fits a generalized estimating equation 
(GEE) model, which can be an appropriate model for repeated measures data. See, 
for example, Diggle, Liang, Zeger  Heagerty for background. 
 Beginning Cgee S-function, @(#) geeformula.q 4.13 98/01/27
 running glm to get initial regression estimate
  (Intercept) as.factor(time)24
 3.051177 -2.705675
  condHypoxia as.factor(time)24:condHypoxia
-0.402259  1.429034
 pv=2*(1-pnorm(abs(summary(fit)$coef[,5])))
 data.frame(summary(fit)$coef,pv)

The gee package doesn't compute the value directly, though other functions like 
lm, glm and others do.  What the code does is use the robust z statistic, which 
is the estimate/robust se, and relate it to the standard normal distribution. 
   Estimate Naive.S.E.   Naive.z Robust.S.E.
 Robust.z
 (Intercept)3.051177 0.02221052 137.37527  0.04897055
 62.306363
 as.factor(time)24 -2.705675 0.10890056 -24.84537  0.19987174
 -13.537057
 condHypoxia   -0.402259 0.03907961 -10.29332  0.10661248
 -3.773095
 as.factor(time)24:condHypoxia  1.429034 0.12549576  11.38711  0.17867421
 7.997988
pv
 (Intercept)   0.00e+00
 as.factor(time)24 0.00e+00
 condHypoxia   1.612350e-04
 as.factor(time)24:condHypoxia 1.332268e-15
 ftable(table(drt$cond,drt$time,predict(fit)))
 0.345501643340608 1.37227675004058 2.64891772174934
 3.05117673373261
 
 
 Oxia0.5  00
 0  485
24 3150
 00
 Hypoxia 0.5  00
 3460
24   0  449
 00
 ## 3-th term gives the difference between the Hypoxia/Oxia at time=0.5
 ## the difference between Hypoxia/Oxia at time=24
 L=matrix(c(0,0,1,1),nrow=1)
 fit$coef[L==1]
  condHypoxia as.factor(time)24:condHypoxia
-0.402259  1.429034
 L%*%fit$coef
 [,1]
 [1,] 1.026775
 wald.test(fit$robust.variance,fit$coef,L=L)
 Wald test:
 --
 
 Chi-squared test:
 X2 = 23.8, df = 1, P( X2) = 1.1e-06
 
 
[[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] trying to import xls or xlsx files

2011-06-23 Thread Abhijit Dasgupta

Gabor's answer explains the error perfectly. You might want to look at the xlsx 
package as well as the RODBC package if you're on Windows. RODBC is really 
fast, if you can use it. 

Abhijit

On Jun 23, 2011, at 2:00 PM, wwreith reith_will...@bah.com wrote:

 library(xlsReadWrite)
 mydata-read.xls(file path, header=TRUE)
 
 however if I change xls to csv it works just fine. Any ideas what I'm doing
 wrong? I have have also using the package gdata with the exact same error.
 
 Below is the error that pops up.
 
 Error in findPerl(verbose = verbose) : 
  perl executable not found. Use perl= argument to specify the correct path.
 Error in file.exists(tfn) : invalid 'file' argument
 
 
 
 --
 View this message in context: 
 http://r.789695.n4.nabble.com/trying-to-import-xls-or-xlsx-files-tp3620580p3620580.html
 Sent from the R help mailing list archive at Nabble.com.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Generate the next column from previous column

2011-06-23 Thread Steven Wolf

Basically you need to set up a recursive relationship.  I'd do this with a
2D array:

G = numeric(6*N)
dim(G) = c(6,N)
G[,1] = c(1,rep(0,5))
for (i in 2:N){G[,i]=G[,i-1]+5* c(1,rep(0,5))}

HTH,
-Steve

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Mkkl
Sent: Thursday, June 23, 2011 5:17 PM
To: r-help@r-project.org
Subject: [R] Generate the next column from previous column

Hi, I'm quite new to R and are stuck with the following problem.

Lets say I have a column consisting of a 1 and the rest zero's, called G0.
G0 - c(1,rep(0,5))

Now what I would like to do is to generate G1 from G0, and G2 from G1 etc...
Just for the simplicity, let's say I need the first entry of the column to
be increased by 5 each time.

How could I do this?
Thanks already!

Kind regards,

Michael


--
View this message in context:
http://r.789695.n4.nabble.com/Generate-the-next-column-from-previous-column-
tp3621051p3621051.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] 'Rscript -e' and stdout() puzzle

2011-06-23 Thread Benjamin Tyner


Hello,

I am curious to know why the output of

   Rscript -e cat(R.version.string,stdout())

includes a trailing  1, whereas

   Rscript -e cat(R.version.string)

does not. I have tried various mechanisms to subvert this behavior, such as

   Rscript -e invisible(con-stdout()); cat(R.version.string, con);
   rm(con); q()

but the trailing  1 remains.

Thanks,
Ben


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] reading the results of a within subject test

2011-06-23 Thread Stephen Ellison

From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of 
Ann [trivialesc...@gmail.com]
Subject: [R] reading the results of a within subject test

Hi, I ran the following in R (on item means):

 aov(VAR ~(a*b)+Error(item/(a*b)), data = item)

 I'm confused...

Could you say more about what 'item' in 'data=item' is? The data= argument is 
usually a data frame which would normally (for your model specification) 
contain a continuous variable 'VAR', and probably categorical variables 
(factors) a, b and item. It is possible to have a data fram called item 
containing a variable called item, but I'm not quite sure that's what you have.

Assuming that you have - or at least hoping so, 'cos I can't see how the model 
would work properly otherwise - you seem to have specified a two-way anova with 
interactions against fixed effect factors a and b, (with a*b expanding to a, b 
and a:b) and an Error term which effectively says you want to test your a, b 
and a:b fixed effects against random effects a, b and a:b (from the a*b) nested 
within Item, so expanding to item:a, item:b and item:a:b. That is why you got 
all those extra tables. But that sounds a bit unlikely. Most experimenters 
either nest items in a:b subgroups (ie observing several different items for 
each a:b group) or, for blocked designs, use all combinations of a:b treatments 
on every one of a number of different items (obviously hard to do unless the 
treatments are reversible  - an example might be experiments like reversible 
exterior temperature/pressure effects on, say, flow meter readings).

I'd guess the more likely scenario is that you have multiple different items 
per a:b group. That would require only one error term - item, uniquely 
identified perhaps as a:b:item or simply as item if the items are already 
individually labelled. 

But you also say 'on the item means'. Asking for item as an error term implies 
that you have residual error within item. You clearly can't see that if you 
only supply item means (which may be one reason aov warned you that you had a 
singular model - that last table would have been seeking to calculate the 
residual variance, and after the item means were taken out, there wan't any 
residual variance). If you had a nested model (multiple different items per a:b 
subgroup), _either_ you should have provided the raw observations (and not the 
means) and specified Error(item) or Error(a:b:item) for which aov() would give 
you a table of tests against the residual MS and another table with tests 
against the item MS (the latter being the more sensible of the two if items are 
random and appreciably different), _or_ you should supply the item means, in 
which case you need not include item in the model because the variation between 
item means within each group is simply the residual error!
 . 

Admittedly I'm speculating quite a bit on what experiment you've actually done, 
but with a bit of luck some of the above will give you a clue...

***
This email and any attachments are confidential. Any use...{{dropped:8}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] 'Rscript -e' and stdout() puzzle

2011-06-23 Thread William Dunlap

 -Original Message-
 From: r-help-boun...@r-project.org 
 [mailto:r-help-boun...@r-project.org] On Behalf Of Benjamin Tyner
 Sent: Thursday, June 23, 2011 5:31 PM
 To: r-help@r-project.org
 Subject: [R] 'Rscript -e' and stdout() puzzle

 Hello,

 I am curious to know why the output of

 Rscript -e cat(R.version.string,stdout())

 includes a trailing  1, whereas

 Rscript -e cat(R.version.string)

Use file=stdout().  Otherwise it prints the value of
as.character(stdout()), which is 1.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com  

 does not. I have tried various mechanisms to subvert this 
 behavior, such as

 Rscript -e invisible(con-stdout()); cat(R.version.string, con);
 rm(con); q()

 but the trailing  1 remains.

 Thanks,
 Ben

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] trying to import xls or xlsx files