[R] R Help

2010-04-02 Thread Ryan Cooper
Has anyone programmed the Nonparametric Canonical Correlation method in R?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] time series problem: time points don't match

2010-04-02 Thread Brad Patrick Schneid

Gabor:
That is not the ideal solution, but it definitely works to provide me with
the easier alternative.  Thanks for the reply!  
-- 
View this message in context: 
http://n4.nabble.com/time-series-problem-time-points-don-t-match-tp1748387p1748706.html
Sent from the R help mailing list archive at Nabble.com.



[R] Exporting Nuopt from splus to R

2010-04-02 Thread Jp2010

Hi all,
Thanks for the wonderful forum with all the valuable help and comments here.

I have been an S-PLUS user for the past 7 to 8 years and am now thinking of
changing over to R. I have been doing a lot of reading, and one of the main
reasons is that R is open source, with all the wonderful things that come with
that.

My question, though: is it possible to port any of the functions or
libraries that come with S-PLUS to R?

My specific situation: Windows platform. If there is a compiled s.dll, is
there a way we can get it working in R? I would think that if it is a function
or source file, it can probably be rewritten in R without much difficulty. But
what about the compiled code? I am not a systems programmer, so I don't know
much about compiling or undoing that.

From my understanding it is going to be difficult. Is my understanding
right?

Thanks

-- 
View this message in context: 
http://n4.nabble.com/Exporting-Nuopt-from-splus-to-R-tp1748681p1748681.html
Sent from the R help mailing list archive at Nabble.com.



[R] ODD and EVEN numbers

2010-04-02 Thread girlme80
Excuse me Carl Withoft! 

For your information, this is not my homework. I'm just helping my friend in a 
part of her R code.

And everytime I ask a question here, it's just a SMALL PART of the 
2-pages-program that I am doing. And for your information, the answers that I 
get, I still think on how to make use of them. It does not mean that when I get 
answers, I use them immediately without thinking!

And you have no right to tell me that coz I don't remember you answering any of 
my questions.

IF YOU DON'T KNOW THE ANSWERS TO MY QUESTIONS, just keep quiet, and let the 
smart guys share their thoughts.



Re: [R] ODD and EVEN numbers

2010-04-02 Thread Detlef Steuer

Just to give you a hint for the future:

If you ask Google for "odd, even, R" you get a message from 2003 as the second 
match:

---
Dave Caccace wrote:
 Hi,
 I'm trying to create a function, jim(p) which varies
 depending on whether the value of p is odd or even. I
 was trying to use th eIf function, but i cant work out
 a formula to work out if p is odd or even.
 Thanks,
 Dave

if(p %% 2) "odd" else "even"

Uwe Ligges 
--
(Hi Uwe!)
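
For a whole vector of numbers, the same modulo test can be sketched as below. The function name parity is made up for illustration, and ifelse() replaces the scalar if/else so that every element is classified.

```r
# Hypothetical helper: label each element of p as "odd" or "even".
# p %% 2 is 0 for even integers and 1 for odd ones (also for negative p in R).
parity <- function(p) ifelse(p %% 2 == 0, "even", "odd")

labels <- parity(c(-3, 0, 7, 10))
print(labels)
```

The scalar form quoted above works because a non-zero remainder is treated as TRUE by if().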

My guess is that using so many capitals in your e-mail has turned away about 1000
helpful souls from your future posts.

Maybe reading the posting guide and spending one minute trying to solve the
problem yourself by googling would be appropriate?
Think for a moment: Google would have given an answer (the answer!) in 1
minute. You wrote an e-mail to quite a few thousand subscribers.
That took more than a minute on your side. And how many hours of
reading time did it take from your readers?

Seasonal greetings
Detlef



On Thu, 01 Apr 2010 17:27:01 -0700
girlm...@yahoo.com wrote:

 Excuse me Carl Withoft! 
 
 For your information, this is not my homework. I'm just helping my friend in 
 a part of her R code.
 
 And everytime I ask a question here, it's just a SMALL PART of the 
 2-pages-program that I am doing. And for your information, the answers that I 
 get, I still think on how to make use of them. It does not mean that when I 
 get answers, I use them immediately without thinking!
 
 And you have no right to tell me that coz I don't remember you answering any 
 of my questions.
 
 IF YOU DON'T KNOW THE ANSWERS TO MY QUESTIONS, just keep quiet, and let the 
 smart guys share their thoughts.
 


[R] Biplot for PCA using labdsv package

2010-04-02 Thread Dilys Vela
Hi everyone,

I am doing PCA with the labdsv package. I was trying to create a biplot
in order to see the arrows for my variables. However, when I run the
script for this graph, the console just keeps saying:

*Error in nrow(y) : element 1 is empty;
   the part of the args list of 'dim' being evaluated was:
   (x)*

Could someone please tell me what this means? What am I doing wrong? I would
really appreciate any suggestions and help.

Thanks,

Dilys



[R] Odp: Using a string as a variable name - revisited

2010-04-02 Thread Petr PIKAL
Hi

Without some insight about foo, list or counts it is impossible to say 
what is wrong.

mat <- matrix(1:12, 3, 4)
colnames(mat) <- letters[1:4]
DF <- as.data.frame(mat)
fac <- factor(names(DF))
> fac
[1] a b c d
Levels: a b c d
> ff <- fac[3]
> ff
[1] c
Levels: a b c d
> DF[[ff]]
[1] 7 8 9
> ff <- fac[1]
> DF[[ff]]
[1] 1 2 3

As you can see, with DF as a data frame and ff extracted from the vector of names 
as a factor, everything seems to work OK.

When anybody except you tries to run

foo <- list$taxon[match(5,list$item)]

they get

Error in list$taxon : object of type 'builtin' is not subsettable

So provide a toy example, or at least the structure of the objects you use, and 
you will probably get a solution.
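
One possible cause can be sketched with made-up data, under the assumption that foo is a factor (the original post does not show its structure). The objects taxa and counts below are hypothetical stand-ins. Indexing with [[ uses a factor's integer code, not its label, so the result depends on how the factor levels line up with the column order; converting to character first selects by name.

```r
# Hypothetical stand-ins for the poster's objects.
counts <- data.frame(Zeta = 1:3, Alpha = 4:6)
taxa <- data.frame(item = c(5, 6), taxon = c("Alpha", "Zeta"),
                   stringsAsFactors = TRUE)

foo <- taxa$taxon[match(5, taxa$item)]  # a factor with label "Alpha", code 1

wrong <- counts[[foo]]                  # code 1 selects the FIRST column, Zeta
right <- counts[[as.character(foo)]]    # the label selects column Alpha

print(wrong)
print(right)
```

In the example further up, the factor levels happen to be in the same order as the columns, which is why it appears to work there.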

Regards
Petr


 
r-help-boun...@r-project.org napsal dne 01.04.2010 23:24:55:

 I would like to revisit a problem that was discussed previously (see
 quoted discussion below). I am trying to do the same thing, using a
 string to indicate a column with the same name. I am making foo a
 string taken from a list of names. It matches the row where item =
 5, and picks the corresponding taxon
 
  foo <- list$taxon[match(5,list$item)]
 
 Let's say this returns foo as Aulacoseira_islandica. I have another
 matrix counts with column headers corresponding to the taxon list.
 But, when I try to access the data in the Aulacoseira_islandica
 column, it instead uses the data from another column. For instance...
 
  columndata <- counts[[foo]]
 
 ...returns the data from the wrong column. What it seems to be doing
 is converting the text Aulacoseira_islandica to a number (25, for
 some reason) and reading the count data from column number 25, instead
 of from the column labelled with Aulacoseira_islandica.
 
 If I try...
 
  columndata <- counts$Aulacoseira_islandica
 
 ...it works fine. Any thoughts?
 
 -Euan
 NRRI-University of Minnesota Duluth
 
 
 __
 Jason Horn-2
 Oct 20, 2006; 06:28pm
 [R] Using a string as a variable name
 
 Is it possible to use a string as a variable name?  For example:
 
 foo="var1"
 frame$foo   # frame is a data frame with with a column titled var1
 
 This does not work, unfortunately.  Am I just missing the correct
 syntax to make this work?
 
 - Jason
 __
 Oct 20, 2006; 06:30pm
 Re: [R] Using a string as a variable name
 
 frame[[foo]]
 
 On 10/20/06, Jason Horn [hidden email] wrote:
 
  Is it possible to use a string as a variable name?  For example:
 
  foo="var1"
  frame$foo   # frame is a data frame with with a column titled var1
 
  This does not work, unfortunately.  Am I just missing the correct
  syntax to make this work?
 
 
  - Jason
 
 -- 
 Jim Holtman
 Cincinnati, OH
 +1 513 646 9390
 


Re: [R] roccomp

2010-04-02 Thread Ravi Kulkarni

The ROCR package has methods to compute AUC and related methods. You might
want to check it out.

Ravi
-- 
View this message in context: 
http://n4.nabble.com/roccomp-tp1748818p1748903.html
Sent from the R help mailing list archive at Nabble.com.



[R] Derivative of a smooth function

2010-04-02 Thread FMH

Dear All,

I've been searching for code to compute the rate of change and the 
curvature of a nonparametric regression model represented by a smooth 
function, but unfortunately have not managed to do it. I presume that such 
characteristics of a smooth curve can be determined by the first and second 
derivatives.

The following is an example of fitting a nonparametric regression model via 
the smoothing spline function, taken from the Help file in R.

###
attach(cars)
plot(speed, dist, main = "data(cars)  &  smoothing splines")
cars.spl <- smooth.spline(speed, dist)
lines(cars.spl, col = "blue")
lines(smooth.spline(speed, dist, df=10), lty=2, col = "red")
legend(5,120,c(paste("default [C.V.] => df =",round(cars.spl$df,1)),"s( * , df = 10)"),
       col = c("blue","red"), lty = 1:2, bg='bisque')
detach()

###


Could someone please advise me on the appropriate way to compute such 
derivatives of the curves fitted by the function above? Thank you in 
advance.

Cheers
Fir 
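
A minimal sketch of one way to do this, assuming the deriv argument of the predict method for smooth.spline fits is acceptable here. The grid of evaluation points and the variable names are chosen for illustration, and the curvature line uses the usual plane-curve formula f'' / (1 + f'^2)^(3/2).

```r
# Fit the same kind of smoothing spline as in the help-file example.
fit <- smooth.spline(cars$speed, cars$dist)

# Evaluation grid over the observed range (50 points is an arbitrary choice).
grid <- seq(min(cars$speed), max(cars$speed), length.out = 50)

d1 <- predict(fit, grid, deriv = 1)  # first derivative: rate of change
d2 <- predict(fit, grid, deriv = 2)  # second derivative

# Signed curvature of the fitted curve at each grid point.
curvature <- d2$y / (1 + d1$y^2)^(3/2)

head(data.frame(speed = grid, slope = d1$y, curvature = curvature))
```

Each predict() call returns a list with components x and y, so the derivatives can be added to the existing plot with lines() if a second axis is set up.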







[R] How to save a model in DB and retrieve It

2010-04-02 Thread Daniele Amberti
I'm wondering how to save an object (models like lm, loess, etc) in a DB to 
retrieve and use it afterwards, an example:

wind_ms <- abs(rnorm(24*30)*4+8)
air_kgm3 <- rnorm(24*30, 0.1)*0.1 + 1.1
wind_dg <- rnorm(24*30) * 360/7
ms <- c(0:25)
kw_mm92 <- c(0,0,0,20,94,205,391,645,979,1375,1795,2000,2040)
kw_mm92 <- c(kw_mm92, rep(2050, length(ms)-length(kw_mm92)))
modelspline <- splinefun(ms, kw_mm92)
kw <- abs(modelspline(wind_ms) - (wind_dg)*2 + (air_kgm3 - 1.15)*300 + 
rnorm(length(wind_ms))*10)
#plot(wind_ms, kw)
windDat <- data.frame(kw, wind_ms, air_kgm3, wind_dg)
windDat[windDat$wind_ms < 3, 'kw'] <- 0
model <- loess(kw ~ wind_ms + air_kgm3 + wind_dg, data = windDat, enp.target = 
10*5*3) #, span = 0.1)

# serialize the fitted model to an ASCII representation
modX <- serialize(model, connection = NULL, ascii = T)

library(RODBC)  # provides odbcConnect() and sqlQuery()
Channel <- odbcConnect("someSysDSN; UID=aUid; PWD=aPwd")
sqlQuery(Channel,
paste(
"INSERT INTO GRT.GeneratorsModels
   ([cGeneratorID]
   ,[tModel])
   VALUES
   (1,",
   paste("'", gsub("'", "''", rawToChar(modX)), "'", sep = ''),
   ")", sep = "") )
# Up to this point it is working correctly;
# in the DB I have the modX variable.
# The problem arises when retrieving the data, with a 64 KB limit:
  strQ <- "
SELECT  CONVERT(varchar(max), tModel) AS tModel
FROM    GRT.GeneratorsModels
WHERE   (cGeneratorID = 1)
"
x <- sqlQuery(Channel, strQ, stringsAsFactors = F, believeNRows = FALSE)
x <- sqlQuery(Channel, strQ, stringsAsFactors = F, believeNRows = FALSE) #read 
error



The code above works for simpler models that have a shorter representation in 
the variable modX.
Any advice on how to store and retrieve this kind of object?
Thanks
Daniele
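
One direction that might help is sketched below without any database involved: serialize the model to raw bytes, encode them as a hex string, and split the string into fixed-size chunks that could each be stored in its own row. The chunk size of 1000 characters, the small loess model on the cars data, and all variable names are assumptions for illustration; whether chunking actually avoids the read limit with a given ODBC driver has not been checked here.

```r
# Small stand-in model (the real one would be the loess fit from the post).
fit <- loess(dist ~ speed, data = cars)

# 1. Serialize to raw bytes and encode as hex text (only 0-9 and a-f,
#    so no quote escaping is needed when building SQL strings).
raw_bytes <- serialize(fit, connection = NULL)
hex_text  <- paste(as.character(raw_bytes), collapse = "")

# 2. Split into chunks; each chunk could be one row keyed by (model id, chunk no).
chunk_size <- 1000                      # assumed value, tune to the column type
starts <- seq(1, nchar(hex_text), by = chunk_size)
chunks <- substring(hex_text, starts, starts + chunk_size - 1)

# 3. Reverse the process: glue chunks in order, decode hex pairs, unserialize.
hex_back <- paste(chunks, collapse = "")
pairs    <- substring(hex_back,
                      seq(1, nchar(hex_back), by = 2),
                      seq(2, nchar(hex_back), by = 2))
raw_back <- as.raw(strtoi(pairs, base = 16L))
fit_back <- unserialize(raw_back)

predict(fit_back, data.frame(speed = 10))
```

Storing the bytes in a binary column, or saving the model to a file with saveRDS() and keeping only the path in the database, are alternatives along the same lines.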


ORS Srl

Via Agostino Morando 1/3 12060 Roddi (Cn) - Italy
Tel. +39 0173 620211
Fax. +39 0173 620299 / +39 0173 433111
Web Site www.ors.it




[R] Hat matrix and MSEP

2010-04-02 Thread linda garcia
Dear all,
   I have a 100 x 5 data matrix and a 100 x 1 response vector. I
have calculated the hat matrix and its diagonal. Now I want to
know whether the diagonal elements say something about future prediction (mean
squared error of prediction). Can I get the variance explained for future data?

Thanks a lot

-- 
Linda Garcia
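
For an ordinary least-squares fit, the hat diagonals do connect to prediction error: the leave-one-out residual for observation i equals e_i / (1 - h_ii), so a cross-validated MSEP and a predictive R-squared can be computed without refitting. The sketch below uses simulated data of the same shape as in the question; the coefficients, seed, and variable names are made up, and it assumes a linear model with an intercept.

```r
set.seed(1)
X <- matrix(rnorm(100 * 5), nrow = 100, ncol = 5)   # simulated 100 x 5 predictors
y <- drop(X %*% c(1, -2, 0.5, 0, 3)) + rnorm(100)   # simulated response

fit <- lm(y ~ X)
h   <- hatvalues(fit)                 # diagonal of the hat matrix

# Leave-one-out residuals from the hat diagonals, no refitting needed.
loo_resid <- residuals(fit) / (1 - h)
msep_loo  <- mean(loo_resid^2)        # leave-one-out estimate of MSEP (PRESS / n)

# Predictive R-squared: share of variance explained for left-out observations.
r2_pred <- 1 - sum(loo_resid^2) / sum((y - mean(y))^2)

# Brute-force check: refit without observation i and predict it.
brute <- vapply(seq_along(y), function(i) {
  f <- lm(y[-i] ~ X[-i, , drop = FALSE])
  y[i] - sum(c(1, X[i, ]) * coef(f))
}, numeric(1))

c(msep_loo = msep_loo, r2_pred = r2_pred,
  max_abs_diff = max(abs(loo_resid - brute)))
```

Observations with large h_ii have their residuals inflated the most, which is how high leverage shows up in the prediction error.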



[R] R: Generative Topographic Map

2010-04-02 Thread mauede
I am running GTM on the same data space points but changing the number of 
latent space points, the number of basis functions and the parameter sigma.
I found a combination of these parameters that works fine.
On the other hand, on page 7 of the paper "The Generative Topographic Mapping" 
by Svensén, Bishop, and Williams, it is stated: "There is no over-fitting if the 
number of sample points is increased since the number of degrees of freedom in 
the model is controlled by the mapping function y(x;W)."

However, since you have translated the code from MATLAB to R, I am pretty sure 
you know what causes the following messages, which routines generate 
them and under which circumstances. Once I have these details clear, I can 
try to avoid the event that causes them.

 gtm_trn: Warning -- M-Step matrix singular, using pinv.\n

1: In chol.default(A, pivot = TRUE) : matrix not positive definite
2: In gtm_trn(T, FI, W, lambda, 1, b, 2, quiet = FALSE, minSing = 0.01) :
  Using 40 out of 40 eigenvalues

Thank you so much.
Maura 

-----Original Message-----
From: Ondrej Such [mailto:ondrej.s...@gmail.com]
Sent: Thu 01/04/2010 17:07
To: mau...@alice.it
Subject: Re: Generative Topographic Map
 
Hello Maura,

Thank you for your email.

Marcus Svensen, one of method's authors works at Microsoft, and can give you
more insight how it works. Email marcu...@microsoft.com, home page
http://research.microsoft.com/en-us/um/people/markussv/. I also found his
Ph.D. thesis very insightful.

I believe it is never useful  to have as many data points in latent space as
there are data points. And with 2000 points I can't imagine going beyond
latent dimension 3 or 4, in applying GTM.

I've heard that GTM can be useful mostly in situations, when data follows a
relatively smooth manifold.

Best,

--Ondrej



2010/4/1 mau...@alice.it

  Thank you. I figured that out myself last night. I always forget that
 read.table does not actually read data into a matrix.
 GTM MatLab toolbox comes with  a nice guide to use the package which may as
 well become an R vignette.

 Anyway, I got the singular matrix warnings myself and do not know whether I
 should be concerned about it or not.
 Moreover, I do not know how to avoid that.
 I will go through some other experiments keeping the data space samples and
 dimensionality fixed and changing some of the input parameters.
 I stress our goal is NOT visualization. We do not know the intrinsic
 dimensionality of that data space samples. Therefore we can only proceed by
 trial--error. That is we vary the dimensionality of the embedding space. In
 this experiment the dimensionality of the data space is 7 so we start out
 projecting our original data to a 1D embedding space, then we try out a 2D
 embedding space, ..., all the way up to a 6D embedding space. Since we do
 not know the intrinsic dimensionality of the original data, we need a method
 to evaluate the reliability of the projection. To assess that we reconstruct
 the data back from the embedding to the data space and here we calculate the
 RMSD between the original data and the reconstructed ones. Basically, using
 RMSD, we need as many reconstructed points as the original number. Such a
 requirement is achieved by choosing as many points in the latent space as in
 the data space. Can such a choice be the cause of the matrix singularity ?
 Futhermore, is the number of basis functions related to the number of latent
 space points somehow ?
 Unluckily, even GTM MatLab documentation is not explicitly providing any
 clear criteria about the parameters choice and their dependence, if any.

 Thank you,
 Maura


 -----Original Message-----
 From: Ondrej Such [mailto:ondrej.s...@gmail.com ondrej.s...@gmail.com]
 Sent: Thu 01/04/2010 11:16
 To: mau...@alice.it
 Subject: Re: Generative Topographic Map


 Hello,

 the problem that's tripping the package is that T is a data.frame and not a
 matrix.

 Simply replacing

  T <- read.table("DHA_TNH.txt")

 with

 T <- as.matrix(read.table("DHA_TNH.txt"))

 makes the code run (though warnings about singular matrices remain, I'm not
 sure to what degree that is worrisome). I'd be curious, as to how you'd
 suggest improving the documentation.

 Hope this helps,

 --Ondrej

 2010/3/31 mau...@alice.it

   I tried to use R version of package
  I noticed the original MatLab Pckage is much better documented.
  I had a look at the R demo code gtm_demo and found that variable Y is
  used in advanced of being created:
  I wrote my own few lines as follows:
   inDir <- "C:/Documents and Settings/Monville/Alanine Dipeptide/DBP1/DHA"
 
   setwd(inDir)
   T <- read.table("DHA_TNH.txt")
   L <- 3
   X <- matrix(nrow=nrow(T),ncol=3,byrow=TRUE)
   MU <- matrix(nrow=round(nrow(T)/5), ncol=L)
 
   for(i in 1:ncol(X)) {
     for(j in 1:nrow(X)) {
       X[j,i] <- RANDU()
     }
   }
 
   for(i in 1:ncol(MU)) {
     for(j in 1:nrow(MU)) {
       MU[j,i] <- RANDU()
     }
   }
   sigma <- 1
 
   FI <- gtm_gbf(MU,sigma,X)
   W - 

[R] plot area: secondary y-axis does not display well

2010-04-02 Thread Muhammad Rahiz

Dear useRs,

I'm having a slight problem with plotting on 2 axes. While the following 
code works alright on screen, the saved output does not turn out as 
desired i.e. the secondary y-axis does not display fully.


Just run the code and look at image output. Suggestions please...

thanks,

Muhammad

---
rm(list=ls())
x <- 1:100
y <- 200:300

par(mar=c(5,5,5,7)+0.1) # inner margin
par(oma=c(3,3,3,7))# outer margin
png("image.png")
plot(x,cex=0.5,type="l",lty=2,pch=3,xlab="year",ylab="x-axis",las=1,col="blue")
par(new=TRUE)
plot(y,cex=0.5,type="l",lty=2,pch=3,xlab="",ylab="",las=1,axes=FALSE,ylim=c(0,500),col="red")
axis(4,las=1)
mtext("y-axis",side=4,line=3)
legend("topleft",col=c("blue","red"),lty=2,legend=c("x","y"),bty="n")
box("figure",col="red")
box("plot",col="blue")
dev.off()



[R] What should I do regarding DLL attempted to change... warning ?

2010-04-02 Thread Tal Galili
Hi all,

The call to:
library(rJava)

Results in the following warning message:

Warning message:
In inDL(x, as.logical(local), as.logical(now), ...) :
  DLL attempted to change FPU control word from 8001f to 9001f


After some searching I found the following explanation:

 R expects all calls to DLLs (including the initializing call) to leave the
 FPU control word unchanged. Many run-time libraries reset the FPU control
 word during initialization; this will cause problems in R, and will result
 in a warning message like DLL attempted to change FPU control word from
 8001f to 9001f. The value 8001f that gets reported is in the format
 expected by the C library routine _controlfp; the raw value that is used
 in the FPU register is 037F.


Also with a few old discussions that explain (for a package developer) how
to avoid this.


The question is, should I, as a useR, do anything regarding this warning
message?


I use winXP , here is my sessionInfo()


R version 2.10.1 (2009-12-14)
i386-pc-mingw32

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] rJava_0.8-3



Thanks,
Tal


Contact
Details:---
Contact me: tal.gal...@gmail.com |  972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
www.r-statistics.com (English)
--



[R] Odp: plot area: secondary y-axis does not display well

2010-04-02 Thread Petr PIKAL
Hi

r-help-boun...@r-project.org napsal dne 02.04.2010 12:12:02:

 Dear useRs,
 
 I'm having a slight problem with plotting on 2 axes. While the following 

 code works alright on screen, the saved output does not turn out as 
 desired i.e. the secondary y-axis does not display fully.
 
 Just run the code and look at image output. Suggestions please...
 
 thanks,
 
 Muhammad
 
 ---
 rm(list=ls())
 x <- 1:100
 y <- 200:300
 
 par(mar=c(5,5,5,7)+0.1) # inner margin
 par(oma=c(3,3,3,7))# outer margin

 png("image.png")

png device does not know about your margin settings. It was called 
**after** call to par. So put your

par(mar=c(5,5,5,7)+0.1) # inner margin
par(oma=c(3,3,3,7))# outer margin

after call to png.

Regards
Petr
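
The reordering can be sketched as follows. The output path in tempdir() and the variable out_file are chosen for illustration, and some plot arguments from the original post are dropped for brevity; the point is only that par() is called after png() has opened the device, so the margins apply to the file being written.

```r
out_file <- file.path(tempdir(), "image.png")  # illustrative output location
x <- 1:100
y <- 200:300

png(out_file)                        # open the device first
par(mar = c(5, 5, 5, 7) + 0.1)       # inner margin, now set on the png device
par(oma = c(3, 3, 3, 7))             # outer margin

plot(x, type = "l", lty = 2, xlab = "year", ylab = "x-axis", las = 1, col = "blue")
par(new = TRUE)                      # overlay the second series on the same region
plot(y, type = "l", lty = 2, xlab = "", ylab = "", las = 1,
     axes = FALSE, ylim = c(0, 500), col = "red")
axis(4, las = 1)                     # secondary y-axis on the right
mtext("y-axis", side = 4, line = 3)
invisible(dev.off())                 # close the device so the file is written
```

Graphical parameters belong to the device that is active when par() is called, which is why settings made before png() only affect the screen device.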

 
 plot(x,cex=0.5,type="l",lty=2,pch=3,xlab="year",ylab="x-axis",las=1,col="blue")
 par(new=TRUE)
 
 plot(y,cex=0.5,type="l",lty=2,pch=3,xlab="",ylab="",las=1,axes=FALSE,ylim=c(0,500),col="red")
 axis(4,las=1)
 mtext("y-axis",side=4,line=3)
 legend("topleft",col=c("blue","red"),lty=2,legend=c("x","y"),bty="n")
 box("figure",col="red")
 box("plot",col="blue")
 dev.off()
 


Re: [R] time series problem: time points don't match

2010-04-02 Thread Gabor Grothendieck
On Thu, Apr 1, 2010 at 7:08 PM, Gabor Grothendieck
ggrothendi...@gmail.com wrote:
 Perhaps something like this:

 library(zoo)
 library(chron)
 # read in data

 Lines1 <- "date            time            level           temp
 2009/10/01 00:01:52.0      2.8797      18.401
 2009/10/01 00:16:52.0      2.8769      18.382
 2009/10/01 00:31:52.0      2.8708      18.309
 2009/10/01 00:46:52.0      2.8728      18.285
 2009/10/01 01:01:52.0      2.8716      18.245
 2009/10/01 01:16:52.0      2.8710      18.190"

 Lines2 <- "date            time            level          temp
 2009/10/01 00:11:06.0      2.9507      18.673
 2009/10/01 00:26:06.0      2.9473      18.630
 2009/10/01 00:41:06.0      2.9470      18.593
 2009/10/01 00:56:06.0      2.9471      18.562
 2009/10/01 01:11:06.0      2.9451      18.518
 2009/10/01 01:26:06.0      2.9471      18.480"

 DF1 <- read.table(textConnection(Lines1), header = TRUE, as.is = TRUE)
 DF2 <- read.table(textConnection(Lines2), header = TRUE, as.is = TRUE)

 z1 <- zoo(DF1[3:4], chron(DF1[,1], DF1[,2], format=c("Y/M/D", "H:M:S")))
 z2 <- zoo(DF2[3:4], chron(DF2[,1], DF2[,2], format=c("Y/M/D", "H:M:S")))

 # process inputs z1 and z2
 # aggregating into 15 minute intervals and merging

 z1a <- aggregate(z1, trunc(time(z1), "00:15:00"), tail, n = 1)
 z2a <- aggregate(z2, trunc(time(z2), "00:25:00"), tail, n = 1)

The last line should have been:

z2a <- aggregate(z2, trunc(time(z2), "00:15:00"), tail, n = 1)


 z <- merge(z1a, z2a)
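
The same idea (truncate each timestamp to its 15-minute interval, then merge on the truncated time) can also be sketched in base R without zoo or chron. This is an illustrative variation, not the code above: the helper to_bin, the shortened data, and the column names are made up, and only the level column is carried along.

```r
# Shortened, made-up versions of the two sites' data.
site1 <- data.frame(
  stamp = c("2009/10/01 00:01:52", "2009/10/01 00:16:52", "2009/10/01 00:31:52"),
  level = c(2.8797, 2.8769, 2.8708))
site2 <- data.frame(
  stamp = c("2009/10/01 00:11:06", "2009/10/01 00:26:06",
            "2009/10/01 00:41:06", "2009/10/01 00:56:06"),
  level = c(2.9507, 2.9473, 2.9470, 2.9471))

# Hypothetical helper: floor a timestamp to the start of its interval.
to_bin <- function(stamp, minutes = 15) {
  t <- as.POSIXct(stamp, format = "%Y/%m/%d %H:%M:%S", tz = "UTC")
  secs <- minutes * 60
  as.POSIXct(floor(as.numeric(t) / secs) * secs,
             origin = "1970-01-01", tz = "UTC")
}

site1$bin <- to_bin(site1$stamp)
site2$bin <- to_bin(site2$stamp)

# all = TRUE keeps intervals present at only one site and fills NA for the other.
merged <- merge(site1[c("bin", "level")], site2[c("bin", "level")],
                by = "bin", all = TRUE, suffixes = c(".1", ".2"))
print(merged)
```

Using all = FALSE instead would drop intervals that are missing at either site, which matches the stated preference for deleting such time points.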


 On Thu, Apr 1, 2010 at 1:35 PM, Brad Patrick Schneid bpsch...@gmail.com 
 wrote:

 Hi,
 I have a time series problem that I would like some help with if you have
 the time.  I have many data from many sites that look like this:

 Site.1
 date            time            level           temp
 2009/10/01 00:01:52.0      2.8797      18.401
 2009/10/01 00:16:52.0      2.8769      18.382
 2009/10/01 00:31:52.0      2.8708      18.309
 2009/10/01 00:46:52.0      2.8728      18.285
 2009/10/01 01:01:52.0      2.8716      18.245
 2009/10/01 01:16:52.0      2.8710      18.190

 Site.2
 date            time            level          temp
 2009/10/01 00:11:06.0      2.9507      18.673
 2009/10/01 00:26:06.0      2.9473      18.630
 2009/10/01 00:41:06.0      2.9470      18.593
 2009/10/01 00:56:06.0      2.9471      18.562
 2009/10/01 01:11:06.0      2.9451      18.518
 2009/10/01 01:26:06.0      2.9471      18.480

 As you can see, the times do not match up.  What I would like to do is be
 able to merge these two data sets to the nearest time stamp by creating a
 new time between the two; something like this:


 date            new.time        level.1       temp.1    level.2         
 temp.2
 2009/10/01 00:01:52.0      2.8797      18.401   NA             NA
 2009/10/01 00:13:59.0      2.8769      18.382      2.9507      18.673
 2009/10/01 00:28:59.0      2.8708      18.309      2.9473      18.630
 2009/10/01 00:43:59.0      2.8728      18.285      2.9470      18.593
 2009/10/01 00:59:59.0      2.8716      18.245     2.9471      18.562
 2009/10/01 01:13:59.0      2.8710      18.190     2.9451      18.518
 2009/10/01 01:26:06.0       NA              NA          2.9471      18.480

 Note that the sites may not match in the # of observations and a return of
 NA would be necessary, but a deletion of that time point all together for
 both sites would be preferred.

 A possibly easier alternative would be a way to assign generic times for
 each observation according to the time interval, so that the 1st observation
 for each day would have a time = 00:00:00 and each consecutive one would be
 15 minutes later.

 Thanks for any suggestions.

 Brad

 --
 View this message in context: 
 http://n4.nabble.com/time-series-problem-time-points-don-t-match-tp1748387p1748387.html
 Sent from the R help mailing list archive at Nabble.com.



Re: [R] Odp: plot area: secondary y-axis does not display well

2010-04-02 Thread Muhammad Rahiz

Thanks Ivan, Jim and Petr.

The output turns out as desired after I've taken your suggestions.

Muhammad



Re: [R] What should I do regarding DLL attempted to change... warning ?

2010-04-02 Thread Duncan Murdoch

On 02/04/2010 7:01 AM, Tal Galili wrote:

Hi all,

The call to:
library(rJava)

Results in the following warning massage:

Warning message:
In inDL(x, as.logical(local), as.logical(now), ...) :
  DLL attempted to change FPU control word from 8001f to 9001f


After some searching I found the following explanation:


R expects all calls to DLLs (including the initializing call) to leave the
FPU control word unchanged. Many run-time libraries reset the FPU control
word during initialization; this will cause problems in R, and will result
in a warning message like DLL attempted to change FPU control word from
8001f to 9001f. The value 8001f that gets reported is in the format
expected by the C library routine _controlfp; the raw value that is used
in the FPU register is 037F.



Also with a few old discussions that explain (for a package developer) how
to avoid this.


The question is, should I, as a useR, do anything regarding this warning
massage ?


It's a bug in the rJava package, so you should report it to the 
maintainer of that package.


Duncan Murdoch




I use winXP , here is my sessionInfo()


R version 2.10.1 (2009-12-14)
i386-pc-mingw32

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] rJava_0.8-3



Thanks,
Tal


Contact
Details:---
Contact me: tal.gal...@gmail.com |  972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
www.r-statistics.com (English)
--



Re: [R] What should I do regarding DLL attempted to change... warning ?

2010-04-02 Thread Tal Galili
Thanks Duncan and Romain,
I'll go and do that.

with regards,
Tal




Contact
Details:---
Contact me: tal.gal...@gmail.com |  972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
www.r-statistics.com (English)
--




On Fri, Apr 2, 2010 at 2:10 PM, Duncan Murdoch murd...@stats.uwo.ca wrote:

 On 02/04/2010 7:01 AM, Tal Galili wrote:

 Hi all,

 The call to:
 library(rJava)

 Results in the following warning massage:

 Warning message:
 In inDL(x, as.logical(local), as.logical(now), ...) :
  DLL attempted to change FPU control word from 8001f to 9001f


 After some searching I found the following explanation:

  R expects all calls to DLLs (including the initializing call) to leave the
 FPU control word unchanged. Many run-time libraries reset the FPU control
 word during initialization; this will cause problems in R, and will result
 in a warning message like "DLL attempted to change FPU control word from
 8001f to 9001f". The value 8001f that gets reported is in the format
 expected by the C library routine _controlfp; the raw value that is used
 in the FPU register is 037F.



 Also with a few old discussions that explain (for a package developer) how
 to avoid this.


 The question is, should I, as a useR, do anything regarding this warning
 message?


 It's a bug in the rJava package, so you should report it to the maintainer
 of that package.

 Duncan Murdoch



 I use winXP , here is my sessionInfo()


 R version 2.10.1 (2009-12-14)
 i386-pc-mingw32

 locale:
 [1] LC_COLLATE=English_United States.1252
 [2] LC_CTYPE=English_United States.1252
 [3] LC_MONETARY=English_United States.1252
 [4] LC_NUMERIC=C
 [5] LC_TIME=English_United States.1252

 attached base packages:
 [1] stats graphics  grDevices utils datasets  methods   base

 other attached packages:
 [1] rJava_0.8-3



 Thanks,
 Tal


 Contact
 Details:---
 Contact me: tal.gal...@gmail.com |  972-52-7275845
 Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
 www.r-statistics.com (English)

 --



[R] what is the significance of RSq in earth function??

2010-04-02 Thread vibha patel
Hello,

I'm using the earth function for multivariate adaptive regression splines.

What is the significance of RSq in the earth function?

The code follows; the printed value is the RSq.

> tr.wage <- sample(1:nrow(HCMwage), 0.8*nrow(HCMwage))
> tst.wage <- (1:nrow(HCMwage))[-tr.wage]
> HCMwageModel <- earth(V2~V3+V4+V5+V6+V7+V8+V9+V10+V11+V12+W, data=HCMwage[tr.wage,])
> prdHCMwage <- predict(HCMwageModel, newdata=HCMwage[tst.wage,])
> wg2HCM <- HCMwage$V2[tst.wage]
> RwageHCM <- (1-sum((wg2HCM - prdHCMwage)^2)/sum((wg2HCM-mean(wg2HCM))^2))
> print(RwageHCM)
[1] 0.3204129


Thanks and Regards,
Vibha
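
For readers following the thread: the quantity computed in the post is the usual R-squared, 1 - SSE/SST, evaluated on held-out rows. Below is a minimal sketch of the same calculation, assuming a plain lm() fit on the built-in mtcars data as a stand-in for the earth model (HCMwage is not available here; r_squared is a hypothetical helper).

```r
# Sketch only: lm() on mtcars stands in for earth() on HCMwage.
set.seed(1)
n <- nrow(mtcars)
train_idx <- sample(seq_len(n), floor(0.8 * n))
test_idx <- setdiff(seq_len(n), train_idx)

fit <- lm(mpg ~ wt + hp, data = mtcars[train_idx, ])

# Hypothetical helper: proportion of variance explained, 1 - SSE/SST.
r_squared <- function(observed, predicted) {
  1 - sum((observed - predicted)^2) / sum((observed - mean(observed))^2)
}

# On the training rows this reproduces the R-squared reported by summary().
rsq_train <- r_squared(mtcars$mpg[train_idx], fitted(fit))
# On the held-out rows it measures out-of-sample fit and is usually lower.
rsq_test <- r_squared(mtcars$mpg[test_idx],
                      predict(fit, newdata = mtcars[test_idx, ]))
print(c(train = rsq_train, test = rsq_test))
```

A test-set value can even be negative when the model predicts worse than the test-set mean.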



[R] Restricting optimisation algorithm's parameter space in GNLM

2010-04-02 Thread dominikj
Hello, 

I have a problem. I am using the NLME library to fit a non-linear model. There
is a linear component to the model with a couple of parameters that can
only be positive (the coefficients are inside a sqrt). When I try to
fit the model to data, the search algorithm tries a negative value
for one of these parameters to see whether it produces a better fit. When it does
so, it crashes, because the argument of the sqrt function cannot be
negative.

QUESTION: How do I restrict the optimisation algorithm's parameter space so
that it does not search negative values when using GNLM? Are there other libraries
that fit non-linear models and let one control the parameter space to which the
search algorithm is restricted?

Any help would be  appreciated.
Thanks 
Dom 
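
Two common ways to keep a parameter positive can be sketched with base R's nls(); this is an illustration under assumptions, not a GNLM or NLME recipe. The model y = a + sqrt(b * x) and the simulated data are made up for the example. The first fit uses box constraints (algorithm = "port" with lower), the second re-parameterises b as exp(log_b) so that any real value of log_b gives a positive b.

```r
# Simulated data for an assumed model y = a + sqrt(b * x); illustration only.
set.seed(10)
x <- seq(1, 10, length.out = 50)
y <- 2 + sqrt(3 * x) + rnorm(50, sd = 0.1)

# Option 1: box constraints. The "port" algorithm of nls() accepts bounds,
# so b is never allowed below 0 during the search.
fit_bounded <- nls(y ~ a + sqrt(b * x),
                   start = list(a = 1, b = 1),
                   algorithm = "port",
                   lower = c(-Inf, 0))

# Option 2: re-parameterise. Estimate log_b on the whole real line and
# back-transform; sqrt() then never sees a negative argument.
fit_log <- nls(y ~ a + sqrt(exp(log_b) * x),
               start = list(a = 1, log_b = 0))
b_hat <- exp(coef(fit_log)[["log_b"]])

print(coef(fit_bounded))
print(b_hat)
```

The re-parameterisation works with any optimiser, including ones that do not support bounds, at the cost of standard errors being reported on the log scale.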




Re: [R] plot area: secondary y-axis does not display well

2010-04-02 Thread Ivan Calandra

Hi Muhammad,

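The problem is that you set the par() options before creating your png.
par() settings apply only to the device that is active when they are set,
so the png device starts with the default margins.
I've tried, and it works if you do this:
... # x and y
png("image.png")
par(mar=c(5,5,5,7)+0.1) # inner margin
par(oma=c(3,3,3,7))     # outer margin
... # rest of your code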

HTH,
Ivan
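
Put together, the reordered script might look like the sketch below. It writes to a temporary directory (out_file is a name chosen for this example) and otherwise reuses the plotting calls from the original post.

```r
x <- 1:100
y <- 200:300

# Assumed output location for the example; any writable path would do.
out_file <- file.path(tempdir(), "image.png")

png(out_file)                       # open the device first ...
par(mar = c(5, 5, 5, 7) + 0.1,      # ... then set inner margins
    oma = c(3, 3, 3, 7))            # and outer margins on that device

plot(x, type = "l", lty = 2, xlab = "year", ylab = "x-axis",
     las = 1, col = "blue")
par(new = TRUE)                     # overlay the second series
plot(y, type = "l", lty = 2, xlab = "", ylab = "", las = 1,
     axes = FALSE, ylim = c(0, 500), col = "red")
axis(4, las = 1)                    # secondary y-axis on the right
mtext("y-axis", side = 4, line = 3)
legend("topleft", col = c("blue", "red"), lty = 2,
       legend = c("x", "y"), bty = "n")
dev.off()
```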

On 4/2/2010 12:12, Muhammad Rahiz wrote:

Dear useRs,

I'm having a slight problem with plotting on 2 axes. While the 
following code works alright on screen, the saved output does not turn 
out as desired i.e. the secondary y-axis does not display fully.


Just run the code and look at image output. Suggestions please...

thanks,

Muhammad

---
rm(list=ls())
x <- 1:100
y <- 200:300

par(mar=c(5,5,5,7)+0.1) # inner margin
par(oma=c(3,3,3,7))# outer margin
png("image.png")
plot(x,cex=0.5,type="l",lty=2,pch=3,xlab="year",ylab="x-axis",las=1,col="blue")
par(new=TRUE)
plot(y,cex=0.5,type="l",lty=2,pch=3,xlab="",ylab="",las=1,axes=FALSE,ylim=c(0,500),col="red")
axis(4,las=1)
mtext("y-axis",side=4,line=3)
legend("topleft",col=c("blue","red"),lty=2,legend=c("x","y"),bty="n")
box("figure",col="red")
box("plot",col="blue")
dev.off()




--
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Museum
Abt. Säugetiere
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY
+49(0)40 42838 6231
ivan.calan...@uni-hamburg.de

**
http://www.for771.uni-bonn.de
http://webapp5.rrz.uni-hamburg.de/mammals/eng/mitarbeiter.php



[R] timeseries plot

2010-04-02 Thread vibha patel
Hello,

I am using the plot() function to plot a time series.

It takes a time-series object as an argument,
but I want to plot predicted data together with the training set, to compare them.

Is there any function available for this?


Vibha



Re: [R] What should I do regarding DLL attempted to change... warning ?

2010-04-02 Thread Prof Brian Ripley

On Fri, 2 Apr 2010, Duncan Murdoch wrote:


On 02/04/2010 7:01 AM, Tal Galili wrote:

Hi all,

The call to:
library(rJava)

Results in the following warning message:

Warning message:
In inDL(x, as.logical(local), as.logical(now), ...) :
  DLL attempted to change FPU control word from 8001f to 9001f


After some searching I found the following explanation:


R expects all calls to DLLs (including the initializing call) to leave the
FPU control word unchanged. Many run-time libraries reset the FPU control
word during initialization; this will cause problems in R, and will result
in a warning message like DLL attempted to change FPU control word from
8001f to 9001f. The value 8001f that gets reported is in the format
expected by the C library routine _controlfp; the raw value that is used
in the FPU register is 037F.



Also with a few old discussions that explain (for a package developer) how
to avoid this.


The question is, should I, as a useR, do anything regarding this warning
message?


It's a bug in the rJava package, so you should report it to the maintainer of 
that package.


I suspect it is much more likely to be in the Java installation being 
linked to, so you should make sure that is fully updated and report 
its version to the maintainer if the problem persists.


It does not do this for me with Sun Java 1.6.0u18.




Duncan Murdoch




I use winXP , here is my sessionInfo()


R version 2.10.1 (2009-12-14)
i386-pc-mingw32

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] rJava_0.8-3



Thanks,
Tal


Contact
Details:---
Contact me: tal.gal...@gmail.com |  972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
www.r-statistics.com (English)

--




--
Brian D. Ripley,                  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



Re: [R] [R-SIG-Finance] Derivative of a smooth function

2010-04-02 Thread Jeff Ryan
Please keep in mind this question has absolutely nothing to do with
finance, and therefore needs to instead be directed to R-help.

Thanks in advance for keeping the R-finance list on topic.

Jeff

On Fri, Apr 2, 2010 at 3:36 AM, FMH kagba2...@yahoo.com wrote:

 Dear All,

 I've been searching for appropriate code to compute the rate of change and
 the curvature of a nonparametric regression model, which is represented by a
 smooth function, but unfortunately I haven't managed to do it. I presume that such
 characteristics of a smooth curve can be determined by the first and second
 derivatives.

 The following is the example of fitting a nonparametric regression model with a
 smoothing spline, taken from the help file in R.

 ###
 attach(cars)
 plot(speed, dist, main = "data(cars)  &  smoothing splines")
 cars.spl <- smooth.spline(speed, dist)
 lines(cars.spl, col = "blue")
 lines(smooth.spline(speed, dist, df=10), lty=2, col = "red")
 legend(5,120,c(paste("default [C.V.] => df =",round(cars.spl$df,1)),
        "s( * , df = 10)"), col = c("blue","red"), lty = 1:2, bg='bisque')
 detach()

 ###


 Could someone please advise me on the appropriate way to determine such
 derivatives of the curves fitted by the function above? Thank you
 in advance.

 Cheers
 Fir





 ___
 r-sig-fina...@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-sig-finance
 -- Subscriber-posting only. If you want to post, subscribe first.
 -- Also note that this is not the r-help list where general R questions 
 should go.




-- 
Jeffrey Ryan
jeffrey.r...@insightalgo.com

ia: insight algorithmics
www.insightalgo.com



Re: [R] Derivative of a smooth function

2010-04-02 Thread Ravi Varadhan
Please learn how to use `RSiteSearch' before posting questions to the list:

> RSiteSearch("derivative smooth function")

This should have provided you with plenty of solutions. 

Ravi.
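
One of the approaches such a search turns up is the deriv argument of predict() for smooth.spline fits. The sketch below reuses the cars fit from the question; the evaluation grid and the curvature formula f''/(1 + f'^2)^(3/2) are choices made for this example.

```r
# Fit the smoothing spline from the question (cars is a built-in dataset).
cars.spl <- smooth.spline(cars$speed, cars$dist)

# Evaluation grid chosen for this example.
grid <- seq(min(cars$speed), max(cars$speed), length.out = 50)

# predict() on a smooth.spline fit returns a list with components x and y;
# deriv = 1 and deriv = 2 give the first and second derivatives.
d1 <- predict(cars.spl, grid, deriv = 1)   # rate of change
d2 <- predict(cars.spl, grid, deriv = 2)   # second derivative

# Signed curvature of the fitted curve at each grid point.
curvature <- d2$y / (1 + d1$y^2)^(3/2)

head(data.frame(speed = grid, slope = d1$y, curvature = curvature))
```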


Ravi Varadhan, Ph.D.
Assistant Professor,
Division of Geriatric Medicine and Gerontology
School of Medicine
Johns Hopkins University

Ph. (410) 502-2619
email: rvarad...@jhmi.edu


- Original Message -
From: FMH kagba2...@yahoo.com
Date: Friday, April 2, 2010 4:39 am
Subject: [R] Derivative of a smooth function
To: r-help@r-project.org
Cc: r-sig-fina...@stat.math.ethz.ch


 Dear All,
 
 I've been searching for appropriate code to compute the rate of
 change and the curvature of a nonparametric regression model, which is
 represented by a smooth function, but unfortunately I haven't managed to do it.
 I presume that such characteristics of a smooth curve can be
 determined by the first and second derivatives.
 
 The following is the example of fitting a nonparametric regression
 model with a smoothing spline, taken from the help file in R.
 
 ###
 attach(cars)
 plot(speed, dist, main = "data(cars)  &  smoothing splines")
 cars.spl <- smooth.spline(speed, dist)
 lines(cars.spl, col = "blue")
 lines(smooth.spline(speed, dist, df=10), lty=2, col = "red")
 legend(5,120,c(paste("default [C.V.] => df =",round(cars.spl$df,1)),
        "s( * , df = 10)"), col = c("blue","red"),
        lty = 1:2, bg='bisque')
 detach()
 
 ###
 
 
 Could someone please advise me on the appropriate way to determine such
 derivatives of the curves fitted by the function above? Thank you
 in advance.
 
 Cheers
 Fir 
 
 
 
 
 


[R] build Mac distribution for R package

2010-04-02 Thread wenjun zheng
Dear R users,

Can somebody give me some suggestions about how to build a Mac distribution of
an R package on my own Mac?

Thanks

-- 
Wenjun



Re: [R] timeseries plot

2010-04-02 Thread Gabor Grothendieck
Here is one way to do it:

set.seed(123)
TS <- ts(1:25 + rnorm(25))
tt <- time(TS)
tt.pred <- end(tt)[1] + 1:10
both <- ts(c(TS, predict(lm(TS ~ tt), list(tt = tt.pred))))
ts.plot(both, TS, gpars = list(type = "o", col = 2:1, pch = 20))

and read ?ts, ?start, ?ts.plot and next time please provide some
sample data using dput.  See last line to every message and the
posting guide.

On Fri, Apr 2, 2010 at 8:14 AM, vibha patel vibhapatel...@gmail.com wrote:
 Hello,

 I am using plot( ) function to plot time-series.

 it takes time-series object as an argument
 but i want to plot predicted data with training set, to compare them.

 is there any function available?


 Vibha

        [[alternative HTML version deleted]]



[R] Plots don't update with xlab, etc. What am I doing wrong.

2010-04-02 Thread Marshall Feldman
Hi,

I've been struggling with this problem the last few days and finally 
discovered it's happening at a very fundamental level. Going through 
Stephen Turner's tutorial on ggplot2, I entered these base graphics 
commands:

> with(diamonds, plot(carat,price))
> with(diamonds, plot(carat,price), xlab="Weight in Carats",
+      ylab="Price in USD", main="Diamonds are expensive!")

The first command works as expected and draws the plot with labels 
"carat" and "price" and no title. The second command makes R redraw the 
plot (I can see it clear and redraw), but it's identical to the first! 
What am I doing wrong?

 Marsh Feldman





Re: [R] build Mac distribution for R package

2010-04-02 Thread David Winsemius


On Apr 2, 2010, at 9:26 AM, wenjun zheng wrote:


Dear R users,

can somebody give me some suggestions about how to build Mac  
distribution on

my own Mac OS


It appears you have not read the most basic background information yet:

http://cran.r-project.org/doc/manuals/R-admin.pdf

--
David.



Thanks

--
Wenjun



David Winsemius, MD
West Hartford, CT



Re: [R] Plots don't update with xlab, etc. What am I doing wrong.

2010-04-02 Thread David Winsemius


On Apr 2, 2010, at 9:56 AM, Marshall Feldman wrote:


Hi,

I've been struggling with this problem the last few days and finally
discovered it's happening at a very fundamental level. Going through
Stephen Turner's tutorial on ggplot2, I entered these base graphics
commands:


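with(diamonds, plot(carat,price))
with(diamonds, plot(carat,price), xlab="Weight in Carats",
   ylab="Price in USD", main="Diamonds are expensive!")


Remove the extraneous ")" after price. It closes the plot() call early, so
xlab, ylab and main are passed to with() instead, where they are silently ignored.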

--  
David.
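
The effect of the misplaced parenthesis can be shown without the diamonds data. In this sketch, show_args is a hypothetical stand-in for plot() that simply reports which extra arguments it received, and mtcars replaces diamonds; the final call draws to a null PDF device so that no file is written.

```r
# Hypothetical stand-in for plot(): returns the names of its extra arguments.
show_args <- function(x, y, ...) names(list(...))

# Parenthesis closed too early: xlab goes to with(), which drops it.
misplaced <- with(mtcars, show_args(wt, mpg), xlab = "Weight")

# Parenthesis in the right place: xlab reaches the inner function.
correct <- with(mtcars, show_args(wt, mpg, xlab = "Weight"))

print(misplaced)  # NULL
print(correct)    # "xlab"

# The corrected form of the plotting call, on mtcars.
pdf(file = NULL)  # null device: nothing is written to disk
with(mtcars, plot(wt, mpg, xlab = "Weight (1000 lbs)",
                  ylab = "Miles per gallon", main = "Weight vs. mileage"))
invisible(dev.off())
```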


The first command works as expected and draws the plot with labels
"carat" and "price" and no title. The second command makes R redraw the
plot (I can see it clear and redraw), but it's identical to the first!
What am I doing wrong?

Marsh Feldman


David Winsemius, MD
West Hartford, CT



[R] (no subject)

2010-04-02 Thread Terry Therneau

 I'm using rpart function for creating regression trees.
 now how to measure the fitness of regression tree???
 
 thanks n Regards,
 Vibha

I read R-help as a digest so often come late to a discussion.  Let me
start by being the first to directly answer the question:

> fit <- rpart(time ~ age + ph.ecog, lung)
> summary(fit)
Call:
rpart(formula = time ~ age + ph.ecog, data = lung)
  n= 228 

          CP nsplit rel error   xerror      xstd
1 0.0351          0 1.000     1.009949 0.1137819
2 0.01459053      1 0.9648333 1.049636 0.1282259
3 0.01324335      3 0.9356523 1.090562 0.1301632
4 0.0100          7 0.8810284 1.063609 0.1298557

Node number 1: 228 observations,    complexity param=0.0351
  mean=305.2325, MSE=44176.93 
  left son=2 (51 obs) right son=3 (177 obs)
  Primary splits:
...

The relative error and cross-validated relative error columns above, for
a regression tree, are equal to 1-R^2.  In this case none of the splits
are useful; even the naive non-cross-validated improvement for the first
split isn't much (R^2 < .04).

Now to the larger debate.  I do not find trees as useless as Frank (does
anyone?).  I like to use them for initial data exploration, in the same
fashion as a scatterplot.  But I fight the same battle that he does with
some colleagues and customers: they are so very easy to interpret that
the results are often severely over-interpreted, sometimes to the point
that the tree did more harm than good.

All forward stepwise procedures are unstable.  Particularly with rich
data sets, such as I see each day in the medical field, there are
multiple overlapping/correlated predictors.  Small changes in the data
will completely change the order of a forward stepwise regression.
Anyone who puts faith in the ORDER of inclusion as a measure of worth is
like a flag in a fitful breeze.

A bigger problem with rpart is that users consistently ignore the xerror
column above, and print out (and believe) bigger trees than they should.
Once the xerror bottoms out you are almost certainly looking at random
noise.  Since the xerror curve often has a long flat bottom the 1SE rule
is better (anything within 1SE of the min is a tie, use the smallest of
a set of tied models).

Terry Therneau
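
The 1SE rule described above can be sketched from the cptable of a fitted tree. This is an illustration on the built-in mtcars data rather than the lung data, and the variable names (cp_tab, best, threshold, chosen) are chosen for the example.

```r
library(rpart)

set.seed(1)  # the xerror column depends on random cross-validation folds
fit <- rpart(mpg ~ ., data = mtcars)

cp_tab <- fit$cptable  # columns: CP, nsplit, rel error, xerror, xstd

# For a regression tree, 1 - rel error is the (non-cross-validated) R^2.
r2 <- 1 - cp_tab[, "rel error"]

# 1SE rule: find the minimum xerror, then take the smallest tree whose
# xerror is within one standard error of that minimum.
best <- which.min(cp_tab[, "xerror"])
threshold <- cp_tab[best, "xerror"] + cp_tab[best, "xstd"]
chosen <- min(which(cp_tab[, "xerror"] <= threshold))

pruned <- prune(fit, cp = cp_tab[chosen, "CP"])
print(pruned$cptable)
```

Rows of the cptable are ordered from the smallest tree to the largest, so the first row that passes the threshold is the smallest tied model.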



Re: [R] Summing data based on certain conditions

2010-04-02 Thread Steve Murray

Dear all,

Thanks for the contributions so far. I've had a look at these and the closest 
I've come to solving it is the following:

> data_ave <- ave(data$rammday, by=c(data$month, data$year))
Warning messages:
1: In split.default(x, g) :
  data length is not a multiple of split variable
2: In split.default(seq_along(x), f, drop = drop, ...) :
  data length is not a multiple of split variable


I'm slightly confused by the warning message, as the data lengths do appear the 
same:

> dim(data)
[1] 1073    6
> length(data$year)
[1] 1073
> length(data$month)
[1] 1073


Maybe the approach I'm taking is wrong. Any suggestions would be gratefully 
received.

Many thanks,

Steve
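
A note on the warning, with a sketch of one possible fix. c(data$month, data$year) glues the two columns into a single vector of length 2146, which ave() then treats as one grouping variable; the grouping columns need to be passed as separate arguments instead. The example below uses the first nine rows of the posted data; rain and days_in_month are names invented for this illustration.

```r
# First nine rows of the posted data (year, month, rainfall in mm/day).
rain <- data.frame(
  year    = rep(1988, 9),
  month   = c(3, 3, 3, 3, 4, 4, 4, 4, 4),
  rammday = c(1.43, 2.86, 5.06, 18.76, 4.49, 8.57, 31.18, 19.67, 3.14)
)

# ave() takes each grouping variable as its own argument.
group_mean <- ave(rain$rammday, rain$month, rain$year)

# One row per month/year combination with the mean daily rainfall.
monthly <- aggregate(rammday ~ month + year, data = rain, FUN = mean)

# Hypothetical helper: number of days in a given month, via Date arithmetic.
days_in_month <- function(year, month) {
  first <- as.Date(sprintf("%d-%02d-01", as.integer(year), as.integer(month)))
  nxt <- seq(first, by = "month", length.out = 2)[2]
  as.integer(difftime(nxt, first, units = "days"))
}

monthly$days  <- mapply(days_in_month, monthly$year, monthly$month)
monthly$total <- monthly$rammday * monthly$days  # mean mm/day times days
print(monthly)
```

Because the helper goes through Date arithmetic, leap-year Februaries are handled without a lookup table.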



 Date: Wed, 31 Mar 2010 23:31:25 +0200
 From: stephan.kola...@gmx.de
 To: smurray...@hotmail.com
 CC: r-help@r-project.org
 Subject: Re: [R] Summing data based on certain conditions

 ?by may also be helpful.

 Stephan


 Steve Murray schrieb:
 Dear all,

 I have a dataset of 1073 rows, the first 15 which look as follows:

 > data[1:15,]
          date year month day rammday thmmday
 1    3/8/1988 1988     3   8    1.43    0.94
 2   3/15/1988 1988     3  15    2.86    0.66
 3   3/22/1988 1988     3  22    5.06    3.43
 4   3/29/1988 1988     3  29   18.76   10.93
 5    4/5/1988 1988     4   5    4.49    2.70
 6   4/12/1988 1988     4  12    8.57    4.59
 7   4/16/1988 1988     4  16   31.18   22.18
 8   4/19/1988 1988     4  19   19.67   12.33
 9   4/26/1988 1988     4  26    3.14    1.79
 10   5/3/1988 1988     5   3   11.51    6.33
 11  5/10/1988 1988     5  10    5.64    2.89
 12  5/17/1988 1988     5  17   37.46   20.89
 13  5/24/1988 1988     5  24    9.86    9.81
 14  5/31/1988 1988     5  31   13.00    8.63
 15   6/7/1988 1988     6   7    0.43    0.00


 I am looking for a way by which I can create monthly totals of rammday 
 (rainfall in mm/day; column 5) by doing the following:

 For each case where the month value and the year are the same (e.g. 3 and 
 1988, in the first four rows), find the mean of the corresponding 
 rammday values and then multiply it by the number of days in that month (i.e. 31 
 in this case).

 Note however that the number of month values in each case isn't always the 
 same (e.g. in this subset of data, there are 4 values for month 3, 5 for 
 month 4 and 5 for month 5). Also the months will of course recycle for the 
 following years, so it's not simply a case of finding a monthly total for 
 *all* the 3s in the whole dataset, just those associated with each year in 
 turn.

 How would I go about doing this in R?

 Any help will be gratefully received.

 Many thanks,

 Steve





  


[R] Cross-validation for parameter selection (glm/logit)

2010-04-02 Thread Jay
My aim is to select a good subset of parameters for my final logit
model built using glm(). What is the best way to cross-validate the
results so that they are reliable?

Let's say that I have a large dataset with thousands of observations. I
split this data into two groups, one that I use for training and
another for validation. First I use the training set to build a model,
and then run stepAIC() with a forward-backward search. BUT, if I base
my parameter selection purely on this result, I suppose it will be
somewhat skewed due to the one-time data split (I use only one training
dataset).

What is the correct way to perform this variable selection? And are
there readily available packages for this?

Similarly, when I have my final parameter set, how should I go about
making the final assessment of the model's predictive ability? CV? What
package?


Thank you in advance,
Jay
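
One way to reduce the dependence on a single split is to repeat the stepwise selection inside each fold of a k-fold cross-validation and look at how often each variable is kept. The sketch below is an illustration under assumptions, not a recommendation: the data are simulated (x3 is pure noise), base R's step() stands in for stepAIC(), and all object names are made up for the example.

```r
# Simulated data for illustration: y depends on x1 and x2, x3 is noise.
set.seed(42)
n <- 300
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
dat$y <- rbinom(n, 1, plogis(1.5 * dat$x1 - 1 * dat$x2))

k <- 5
fold <- sample(rep(1:k, length.out = n))  # random fold assignment
selected <- vector("list", k)
pred <- numeric(n)

for (i in 1:k) {
  train <- dat[fold != i, ]
  test  <- dat[fold == i, ]
  full <- glm(y ~ x1 + x2 + x3, data = train, family = binomial)
  # Selection is redone within the fold, so the held-out rows play no
  # part in choosing the variables.
  fit <- step(full, direction = "both", trace = 0)
  selected[[i]] <- attr(terms(fit), "term.labels")
  pred[fold == i] <- predict(fit, newdata = test, type = "response")
}

selection_freq <- table(unlist(selected))  # how often each term survived
cv_error <- mean((pred > 0.5) != dat$y)    # out-of-fold misclassification
print(selection_freq)
print(cv_error)
```

Variables that survive in most folds are more trustworthy than those picked in only one, and the out-of-fold error is an estimate for the whole procedure (selection plus fitting) rather than for one particular model.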



[R] POSIX primer

2010-04-02 Thread Doran, Harold
I have not used POSIX classes previously and now have a need to use them. I 
have sports data with times of some athletes after different events. I need to 
perform some simple analyses using the times. I think I've figured out how to 
do this. I just want to confirm with others who have more experience that this 
is indeed the correct approach. If not, please suggest a more appropriate way.

Suppose I have times for two athletes after event 1.

> times <- c('14:15', '16:45')

Now, I use strptime() as follows

> x <- strptime(times, "%M:%S")
> x
[1] "2010-04-02 00:14:15" "2010-04-02 00:16:45"

> class(x)
[1] "POSIXt"  "POSIXlt"

Now, I want the average time across all athletes as well as the min and max, so 
I do:

> mean(x); min(x); max(x)
[1] "2010-04-02 00:15:30 EDT"
[1] "2010-04-02 00:14:15 EDT"
[1] "2010-04-02 00:16:45 EDT"

Now, I want to rank order the athletes:

> rank(x)
Error in if (xi == xj) 0L else if (xi > xj) 1L else -1L :
  missing value where TRUE/FALSE needed

But, I can rank order the following.

> rank(times)
[1] 1 2

I don't need the date in the object x, but I can't figure out how to remove it. 
Nonetheless, it doesn't seem to affect anything.

> x
[1] "2010-04-02 00:14:15" "2010-04-02 00:16:45"

Is this the right approach for using time variables and performing some 
computations on them? Or, is there another approach I should look at.

Thanks,
Harold

 sessionInfo()
R version 2.10.0 (2009-10-26)
i386-pc-mingw32

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] lme4_0.999375-32   Matrix_0.999375-31 lattice_0.17-26

loaded via a namespace (and not attached):
[1] grid_2.10.0  tools_2.10.0
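
Since the values are durations rather than clock times, one option that avoids dates altogether is to convert "mm:ss" strings to seconds and work with plain numbers. This is a base-R sketch; secs and fmt are names invented for the example.

```r
times <- c("14:15", "16:45")

# Split "mm:ss" and convert each entry to a number of seconds.
parts <- strsplit(times, ":", fixed = TRUE)
secs <- vapply(parts,
               function(p) as.numeric(p[1]) * 60 + as.numeric(p[2]),
               numeric(1))

# Hypothetical helper: format seconds back to "mm:ss" for display.
fmt <- function(s) {
  sprintf("%02d:%02d", as.integer(s %/% 60), as.integer(round(s %% 60)))
}

rank(secs)        # ranking works on plain numbers
fmt(mean(secs))   # "15:30"
fmt(range(secs))  # "14:15" "16:45"
```

Note that rank(times) on the raw strings sorts alphabetically, which only matches the numeric order while every string has the same zero-padded width.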



Re: [R] Summing data based on certain conditions

2010-04-02 Thread ONKELINX, Thierry
Dear Steve,

Multiplying the mean by the number of observations is essentially the same as 
summing the numbers.

Have a look at the plyr package.

library(plyr)
ddply(data, c("month", "year"), function(x){
  c(MeanMultiplied = mean(x$rammday) * nrow(x), Sum = sum(x$rammday))
})



ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek
team Biometrie & Kwaliteitszorg
Gaverstraat 4
9500 Geraardsbergen
Belgium

Research Institute for Nature and Forest
team Biometrics & Quality Assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium

tel. + 32 54/436 185
thierry.onkel...@inbo.be
www.inbo.be

To call in the statistician after the experiment is done may be no more than 
asking him to perform a post-mortem examination: he may be able to say what the 
experiment died of.
~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data.
~ Roger Brinner

The combination of some data and an aching desire for an answer does not ensure 
that a reasonable answer can be extracted from a given body of data.
~ John Tukey
  

 -----Original Message-----
 From: r-help-boun...@r-project.org 
 [mailto:r-help-boun...@r-project.org] On Behalf Of Steve Murray
 Sent: Friday, 2 April 2010 16:37
 To: stephan.kola...@gmx.de; gunter.ber...@gene.com
 CC: r-help@r-project.org
 Subject: Re: [R] Summing data based on certain conditions
 
 
 Dear all,
 
 Thanks for the contributions so far. I've had a look at these 
 and the closest I've come to solving it is the following:
 
 > data_ave <- ave(data$rammday, by=c(data$month, data$year))
 Warning messages:
 1: In split.default(x, g) :
   data length is not a multiple of split variable
 2: In split.default(seq_along(x), f, drop = drop, ...) :
   data length is not a multiple of split variable
 
 
 I'm slightly confused by the warning message, as the data 
 lengths do appear the same:
 
 > dim(data)
 [1] 1073    6
 > length(data$year)
 [1] 1073
 > length(data$month)
 [1] 1073
 
 
 Maybe the approach I'm taking is wrong. Any suggestions would 
 be gratefully received.
 
 Many thanks,
 
 Steve
 
 
 
  Date: Wed, 31 Mar 2010 23:31:25 +0200
  From: stephan.kola...@gmx.de
  To: smurray...@hotmail.com
  CC: r-help@r-project.org
  Subject: Re: [R] Summing data based on certain conditions
 
  ?by may also be helpful.
 
  Stephan
 
 
  Steve Murray schrieb:
  Dear all,
 
  I have a dataset of 1073 rows, the first 15 which look as follows:
 
  > data[1:15,]
           date year month day rammday thmmday
  1    3/8/1988 1988     3   8    1.43    0.94
  2   3/15/1988 1988     3  15    2.86    0.66
  3   3/22/1988 1988     3  22    5.06    3.43
  4   3/29/1988 1988     3  29   18.76   10.93
  5    4/5/1988 1988     4   5    4.49    2.70
  6   4/12/1988 1988     4  12    8.57    4.59
  7   4/16/1988 1988     4  16   31.18   22.18
  8   4/19/1988 1988     4  19   19.67   12.33
  9   4/26/1988 1988     4  26    3.14    1.79
  10   5/3/1988 1988     5   3   11.51    6.33
  11  5/10/1988 1988     5  10    5.64    2.89
  12  5/17/1988 1988     5  17   37.46   20.89
  13  5/24/1988 1988     5  24    9.86    9.81
  14  5/31/1988 1988     5  31   13.00    8.63
  15   6/7/1988 1988     6   7    0.43    0.00
 
 
  I am looking for a way by which I can create monthly 
 totals of rammday (rainfall in mm/day; column 5) by doing the 
 following:
 
  For each case where the month value and the year are the 
 same (e.g. 3 and 1988, in the first four rows), find the mean 
 of the corresponding rammday values and then multiply it by the 
 number of days in that month (i.e. 31 in this case).
 
  Note however that the number of month values in each case 
 isn't always the same (e.g. in this subset of data, there are 
 4 values for month 3, 5 for month 4 and 5 for month 5). Also 
 the months will of course recycle for the following years, so 
 it's not simply a case of finding a monthly total for *all* 
 the 3s in the whole dataset, just those associated with each 
 year in turn.
 
  How would I go about doing this in R?
 
  Any help will be gratefully received.
 
  Many thanks,
 
  Steve
 
 
 
 
 
 
 

Please do not print this message unnecessarily.

Re: [R] POSIX primer

2010-04-02 Thread Gabor Grothendieck
On Fri, Apr 2, 2010 at 10:49 AM, Doran, Harold hdo...@air.org wrote:
 I have not used POSIX classes previously and now have a need to use them. I 
 have sports data with times of some athletes

The main reason to use POSIXct is if you need time zones.
If you don't then you might be better off with chron.  See R News 4/1.

> library(chron)
> tt <- times(c('00:14:15', '00:16:45'))
> summary(tt)
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
00:14:15 00:14:52 00:15:30 00:15:30 00:16:08 00:16:45
> min(tt); max(tt); mean(tt)
[1] 00:14:15
[1] 00:16:45
[1] 00:15:30

> rank(tt)
[1] 1 2
> order(tt)
[1] 1 2
> sort(tt)
[1] 00:14:15 00:16:45
> tt[order(tt)]
[1] 00:14:15 00:16:45
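
If adding a package is not an option, roughly the same summaries can be had in base R by treating the race times as plain elapsed seconds, so that no date or time zone is involved. This is only a sketch and not something proposed in the thread: the helper name fmt_mmss and the variable names are made up for illustration, and it assumes every time is written as "MM:SS".

```r
# Race times as "MM:SS" strings (same values as in the thread).
times <- c("14:15", "16:45")

# Split each string on ":" and convert it to elapsed seconds.
parts <- strsplit(times, ":", fixed = TRUE)
secs  <- sapply(parts, function(p) as.numeric(p[1]) * 60 + as.numeric(p[2]))

# Ordinary numeric summaries now work, including rank().
secs_mean <- mean(secs)
secs_rank <- rank(secs)

# Hypothetical helper: format seconds back to "MM:SS" for display.
fmt_mmss <- function(s) {
  sprintf("%02d:%02d", as.integer(s %/% 60), as.integer(round(s %% 60)))
}

fmt_mmss(c(min(secs), secs_mean, max(secs)))
```

The trade-off is that the values are bare numbers, so printing needs the helper; chron's times class does that formatting automatically.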

 after different events. I need to perform some simple analyses using the 
 times. I think I've figured out how to do this. I just want to confirm with 
 others who have more experience that this is indeed the correct approach. If 
 not, please suggest a more appropriate way.

 Suppose I have times for two athletes after event 1.

 times <- c('14:15', '16:45')

 Now, I use strptime() as follows

 x <- strptime(times, "%M:%S")
 x
 [1] "2010-04-02 00:14:15" "2010-04-02 00:16:45"

 class(x)
 [1] "POSIXt"  "POSIXlt"

 Now, I want the average time across all athletes as well as the min and max, 
 so I do:

 mean(x); min(x); max(x)
 [1] "2010-04-02 00:15:30 EDT"
 [1] "2010-04-02 00:14:15 EDT"
 [1] "2010-04-02 00:16:45 EDT"

 Now, I want to rank order the athletes:

 rank(x)
 Error in if (xi == xj) 0L else if (xi > xj) 1L else -1L :
   missing value where TRUE/FALSE needed

 But, I can rank order the following.

 rank(times)
 [1] 1 2

 I don't need the date in the object x, but I can't figure out how to remove 
 it. Nonetheless, it doesn't seem to affect anything.

 x
 [1] "2010-04-02 00:14:15" "2010-04-02 00:16:45"

 Is this the right approach for using time variables and performing some 
 computations on them? Or is there another approach I should look at?

 Thanks,
 Harold

 sessionInfo()
 R version 2.10.0 (2009-10-26)
 i386-pc-mingw32

 locale:
 [1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252
 [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
 [5] LC_TIME=English_United States.1252

 attached base packages:
 [1] stats     graphics  grDevices utils     datasets  methods   base

 other attached packages:
 [1] lme4_0.999375-32   Matrix_0.999375-31 lattice_0.17-26

 loaded via a namespace (and not attached):
 [1] grid_2.10.0  tools_2.10.0



[R] Merge failure using zoo package

2010-04-02 Thread e-letter
Readers,

Please refer to attached example data files. It seems that the merge
function fails for the latter section of the data set. Command
terminal output:

> library(chron)
> library(zoo)
> x <- read.zoo("test1.csv", header=TRUE, sep=",", FUN=times)
> y <- read.zoo("test2.csv", header=TRUE, sep=",", FUN=times)
> z <- (na.approx(merge(x[,2], y[,2]), time(z1)))
> z
            x[, 2]    y[, 2]
01:01:01 0.5418645 0.1755847
01:01:30 0.3486081 0.2068249
01:01:42 0.4808362 0.2380651
01:02:00 0.6130642 0.4983712
01:02:23 0.3140116 0.7586773
01:19:00 0.8545863 0.8927112
01:24:00 0.965 0.1490374

To overcome this behaviour the files test3 and test4 were created by
removing data that had been merged previously. Command terminal output
below:

> x <- read.zoo("test3.csv", header=TRUE, sep=",", FUN=times)
> y <- read.zoo("test4.csv", header=TRUE, sep=",", FUN=times)
> z <- (na.approx(merge(x[,2], y[,2]), time(z1)))
> z
            x[, 2]    y[, 2]
01:03:06 0.4827475 0.7350236
01:03:30 0.6951390 0.8376028
01:03:50 0.5798283 0.9401821
01:04:00 0.4645176 0.8330635
01:04:30 0.6167257 0.7259450
01:19:00 0.8545863 0.8927112
01:24:00 0.965 0.1490374

The only way to obtain a more complete merge of the data sets is to
create manually new files where previously merged data is removed and
then put all the merged data into a new file. Surely this package
should merge the data sets completely?

yours,

rh...@conference.jabber.org
r251
mandriva2008


Re: [R] Summing data based on certain conditions

2010-04-02 Thread Phil Spector

Steve -
   Take a closer look at the help page for ave(), especially
the ... argument.  Try

data_ave <- ave(data$rammday, data$month, data$year, FUN=mean)

(Assuming you want to calculate the mean -- your example 
didn't specify a function.)
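
Since ave() returns one value per row rather than one per month, a grouped summary may be closer to the monthly totals that were asked for. The following is a sketch under assumptions: the data frame just retypes the 15 rows quoted in this thread, and days_in_month() is a hypothetical helper written for the example, not a base R function.

```r
# The 15 rows quoted in the thread (only the columns needed here).
data <- data.frame(
  year    = rep(1988, 15),
  month   = c(3, 3, 3, 3, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 6),
  rammday = c(1.43, 2.86, 5.06, 18.76, 4.49, 8.57, 31.18, 19.67, 3.14,
              11.51, 5.64, 37.46, 9.86, 13.00, 0.43)
)

# Hypothetical helper: days in a month, computed as the gap between
# the first day of that month and the first day of the next one.
days_in_month <- function(year, month) {
  year  <- as.integer(year)
  month <- as.integer(month)
  first     <- as.Date(sprintf("%d-%02d-01", year, month))
  nxt_year  <- ifelse(month == 12L, year + 1L, year)
  nxt_month <- ifelse(month == 12L, 1L, month + 1L)
  nxt       <- as.Date(sprintf("%d-%02d-01", nxt_year, nxt_month))
  as.integer(nxt - first)
}

# One row per year/month combination holding the mean daily rainfall.
monthly <- aggregate(rammday ~ year + month, data = data, FUN = mean)

# Estimated monthly total = mean daily rainfall * days in that month.
monthly$days  <- days_in_month(monthly$year, monthly$month)
monthly$total <- monthly$rammday * monthly$days
monthly
```

Grouping on year and month together keeps March 1988 separate from March of any other year, which is the point raised in the original question.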


- Phil Spector
 Statistical Computing Facility
 Department of Statistics
 UC Berkeley
 spec...@stat.berkeley.edu


On Fri, 2 Apr 2010, Steve Murray wrote:



Dear all,

Thanks for the contributions so far. I've had a look at these and the closest 
I've come to solving it is the following:


data_ave <- ave(data$rammday, by=c(data$month, data$year))

Warning messages:
1: In split.default(x, g) :
  data length is not a multiple of split variable
2: In split.default(seq_along(x), f, drop = drop, ...) :
  data length is not a multiple of split variable


I'm slightly confused by the warning message, as the data lengths do appear the 
same:


dim(data)

[1] 1073    6

length(data$year)

[1] 1073

length(data$month)

[1] 1073


Maybe the approach I'm taking is wrong. Any suggestions would be gratefully 
received.

Many thanks,

Steve




Date: Wed, 31 Mar 2010 23:31:25 +0200
From: stephan.kola...@gmx.de
To: smurray...@hotmail.com
CC: r-help@r-project.org
Subject: Re: [R] Summing data based on certain conditions

?by may also be helpful.

Stephan


Steve Murray schrieb:

Dear all,

I have a dataset of 1073 rows, the first 15 which look as follows:


data[1:15,]

date year month day rammday thmmday
1 3/8/1988 1988 3 8 1.43 0.94
2 3/15/1988 1988 3 15 2.86 0.66
3 3/22/1988 1988 3 22 5.06 3.43
4 3/29/1988 1988 3 29 18.76 10.93
5 4/5/1988 1988 4 5 4.49 2.70
6 4/12/1988 1988 4 12 8.57 4.59
7 4/16/1988 1988 4 16 31.18 22.18
8 4/19/1988 1988 4 19 19.67 12.33
9 4/26/1988 1988 4 26 3.14 1.79
10 5/3/1988 1988 5 3 11.51 6.33
11 5/10/1988 1988 5 10 5.64 2.89
12 5/17/1988 1988 5 17 37.46 20.89
13 5/24/1988 1988 5 24 9.86 9.81
14 5/31/1988 1988 5 31 13.00 8.63
15 6/7/1988 1988 6 7 0.43 0.00


I am looking for a way by which I can create monthly totals of rammday 
(rainfall in mm/day; column 5) by doing the following:

For each case where the month value and the year are the same (e.g. 3 and 1988, 
in the first four rows), find the mean of the corresponding rammday values 
and then multiply by the number of days in that month (i.e. 31 in this case).

Note however that the number of month values in each case isn't always the same 
(e.g. in this subset of data, there are 4 values for month 3, 5 for month 4 and 
5 for month 5). Also the months will of course recycle for the following years, 
so it's not simply a case of finding a monthly total for *all* the 3s in the 
whole dataset, just those associated with each year in turn.

How would I go about doing this in R?

Any help will be gratefully received.

Many thanks,

Steve





Re: [R] POSIX primer

2010-04-02 Thread Doran, Harold
Beautiful. Thank you.



Re: [R] roccomp

2010-04-02 Thread joann

Thank you, Ravi,

I have looked at that package, but I don't see any method to compare two ROC 
curves. I believe the method used by roccomp is based on DeLong.

JoAnn

From: Ravi Kulkarni [via R] [ml-node+1748903-1261333028-216...@n4.nabble.com]
Sent: Friday, April 02, 2010 2:57 AM
To: Alvarez, Joann Marie
Subject: Re: roccomp

The ROCR package has methods to compute AUC and related methods. You might want 
to check it out.

Ravi




Re: [R] Merge failure using zoo package

2010-04-02 Thread e-letter
Data files test1.csv, test2.csv, test3.csv and test4.csv follow in that order, separated by blank lines.

time1,dataset1
01:01:00,0.73512097
01:01:30,0.34860813
01:02:00,0.61306418
01:02:30,0.01495898
01:03:00,0.27035612
01:03:30,0.69513898
01:04:00,0.46451758
01:04:30,0.61672569
01:05:00,0.82496122
01:05:30,0.34766154
01:06:00,0.69618714
01:06:30,0.39035214
01:07:00,0.01680143
01:07:30,0.28576967
01:08:00,0.01205416
01:08:30,0.89637254
01:09:00,0.63147653
01:09:30,0.01522139
01:10:00,0.27661960
01:10:30,0.50974124
01:11:00,0.68141977
01:11:30,0.90725854
01:12:00,0.83823443
01:12:30,0.53360241
01:13:00,0.17769196
01:13:30,0.83438616
01:14:00,0.67248807
01:14:30,0.09991933
01:15:00,0.03334966
01:15:30,0.93292355
01:16:00,0.15990837
01:16:30,0.05354050
01:17:00,0.55281203
01:17:30,0.37845690
01:18:00,0.89051365
01:18:30,0.16674292
01:19:00,0.85458626
01:19:30,0.19278550
01:20:00,0.73240405
01:20:30,0.16417524
01:21:00,0.73878212
01:21:30,0.51790118
01:22:00,0.83076438
01:22:30,0.4704
01:23:00,0.02108640
01:23:30,0.82911053
01:24:00,0.9646
01:24:30,0.14493657
01:25:00,0.84422332
01:25:30,0.41589974
01:26:00,0.67606367
01:26:30,0.00606434
01:27:00,0.59951991
01:27:30,0.43949260
01:28:00,0.66297385
01:28:30,0.33131298
01:29:00,0.06102041
01:29:30,0.84722118
01:30:00,0.46841491
01:30:30,0.34200755
01:31:00,0.87386578
01:31:30,0.70737403
01:32:00,0.23978781
01:32:30,0.11787278
01:33:00,0.14679814
01:33:30,0.65217063
01:34:00,0.81355908
01:34:30,0.31583482
01:35:00,0.92167666
01:35:30,0.55931271
01:36:00,0.13641271
01:36:30,0.35048575
01:37:00,0.17243584
01:37:30,0.93645686
01:38:00,0.85356548
01:38:30,0.61399352
01:39:00,0.05910707
01:39:30,0.01721605
01:40:00,0.94845557
01:40:30,0.48117810
01:41:00,0.34752402
01:41:30,0.59295472
01:42:00,0.64267429
01:42:30,0.57859933
01:43:00,0.00201441
01:43:30,0.32530995
01:44:00,0.25474645
01:44:30,0.93187534
01:45:00,0.99361033
01:45:30,0.16591641

time2,dataset2
01:01:01,0.17558467
01:01:42,0.23806514
01:02:23,0.75867726
01:03:06,0.73502357
01:03:50,0.94018206
01:04:35,0.61882643
01:05:21,0.68417492
01:06:08,0.05744461
01:06:55,0.33344394
01:07:44,0.68752593
01:08:33,0.17270469
01:09:23,0.81522124
01:10:03,0.68304352
01:10:43,0.38774082
01:11:23,0.84176890
01:12:04,0.0936
01:12:44,0.13431965
01:13:25,0.92210721
01:14:06,0.33630635
01:14:47,0.56690294
01:15:29,0.09870816
01:16:11,0.77864105
01:16:53,0.61803441
01:17:35,0.09133728
01:18:17,0.08925487
01:19:00,0.89271117
01:19:42,0.56605742
01:20:25,0.98520534
01:21:08,0.66104843
01:21:51,0.96948589
01:22:34,0.05692690
01:23:17,0.71887456
01:24:00,0.14903741
01:24:43,0.86569445
01:25:26,0.27923513
01:26:09,0.98365033
01:26:53,0.08308399
01:27:36,0.87071027
01:28:19,0.26475705
01:29:03,0.76409811
01:29:47,0.59563256
01:30:31,0.23995054
01:31:14,0.00951054
01:31:59,0.21367270

time1,dataset1
01:02:30,0.01495898
01:03:00,0.27035612
01:03:30,0.69513898
01:04:00,0.46451758
01:04:30,0.61672569
01:05:00,0.82496122
01:05:30,0.34766154
01:06:00,0.69618714
01:06:30,0.39035214
01:07:00,0.01680143
01:07:30,0.28576967
01:08:00,0.01205416
01:08:30,0.89637254
01:09:00,0.63147653
01:09:30,0.01522139
01:10:00,0.27661960
01:10:30,0.50974124
01:11:00,0.68141977
01:11:30,0.90725854
01:12:00,0.83823443
01:12:30,0.53360241
01:13:00,0.17769196
01:13:30,0.83438616
01:14:00,0.67248807
01:14:30,0.09991933
01:15:00,0.03334966
01:15:30,0.93292355
01:16:00,0.15990837
01:16:30,0.05354050
01:17:00,0.55281203
01:17:30,0.37845690
01:18:00,0.89051365
01:18:30,0.16674292
01:19:00,0.85458626
01:19:30,0.19278550
01:20:00,0.73240405
01:20:30,0.16417524
01:21:00,0.73878212
01:21:30,0.51790118
01:22:00,0.83076438
01:22:30,0.4704
01:23:00,0.02108640
01:23:30,0.82911053
01:24:00,0.9646
01:24:30,0.14493657
01:25:00,0.84422332
01:25:30,0.41589974
01:26:00,0.67606367
01:26:30,0.00606434
01:27:00,0.59951991
01:27:30,0.43949260
01:28:00,0.66297385
01:28:30,0.33131298
01:29:00,0.06102041
01:29:30,0.84722118
01:30:00,0.46841491
01:30:30,0.34200755
01:31:00,0.87386578
01:31:30,0.70737403
01:32:00,0.23978781
01:32:30,0.11787278
01:33:00,0.14679814
01:33:30,0.65217063
01:34:00,0.81355908
01:34:30,0.31583482
01:35:00,0.92167666
01:35:30,0.55931271
01:36:00,0.13641271
01:36:30,0.35048575
01:37:00,0.17243584
01:37:30,0.93645686
01:38:00,0.85356548
01:38:30,0.61399352
01:39:00,0.05910707
01:39:30,0.01721605
01:40:00,0.94845557
01:40:30,0.48117810
01:41:00,0.34752402
01:41:30,0.59295472
01:42:00,0.64267429
01:42:30,0.57859933
01:43:00,0.00201441
01:43:30,0.32530995
01:44:00,0.25474645
01:44:30,0.93187534
01:45:00,0.99361033
01:45:30,0.16591641

time2,dataset2
01:03:06,0.73502357
01:03:50,0.94018206
01:04:35,0.61882643
01:05:21,0.68417492
01:06:08,0.05744461
01:06:55,0.33344394
01:07:44,0.68752593
01:08:33,0.17270469
01:09:23,0.81522124
01:10:03,0.68304352
01:10:43,0.38774082
01:11:23,0.84176890
01:12:04,0.0936
01:12:44,0.13431965
01:13:25,0.92210721
01:14:06,0.33630635
01:14:47,0.56690294
01:15:29,0.09870816
01:16:11,0.77864105
01:16:53,0.61803441
01:17:35,0.09133728
01:18:17,0.08925487
01:19:00,0.89271117

Re: [R] Cross-validation for parameter selection (glm/logit)

2010-04-02 Thread JLucke
Jay
Unless I have misunderstood some statistical subtleties, you can use the 
AIC in place of actual cross-validation, as the AIC is asymptotically 
equivalent to leave-one-out cross-validation under MLE.
Joe

Stone, M. (1977).
An asymptotic equivalence of choice of model by cross-validation and 
Akaike's criterion.
Journal of the Royal Statistical Society, Series B (Methodological), 
39, 44-47.
Abstract: A logarithmic assessment of the performance of a predicting 
density is found to lead to asymptotic equivalence of choice of model by 
cross-validation and Akaike's criterion, when maximum likelihood 
estimation is used within each model. 
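
For anyone who wants to look at both criteria side by side, here is a minimal base R sketch of k-fold cross-validation for a logit model next to the AIC of the same model. Everything in it is an assumption made for illustration: the data are simulated (not from this thread), cv_logloss() is a hypothetical helper, and the response column is assumed to be named y and coded 0/1.

```r
set.seed(1)

# Simulated data: x1 matters, x2 is pure noise.
n <- 200
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
d$y <- rbinom(n, 1, plogis(0.5 + 1.2 * d$x1))

# Hypothetical helper: k-fold cross-validated log-loss (mean negative
# log-likelihood on the held-out fold) for a given model formula.
cv_logloss <- function(formula, data, k = 10) {
  fold <- sample(rep(seq_len(k), length.out = nrow(data)))
  loss <- numeric(k)
  for (i in seq_len(k)) {
    fit  <- glm(formula, family = binomial, data = data[fold != i, ])
    test <- data[fold == i, ]
    p    <- predict(fit, newdata = test, type = "response")
    # Assumes the response column is called y and is coded 0/1.
    loss[i] <- -mean(test$y * log(p) + (1 - test$y) * log(1 - p))
  }
  mean(loss)
}

cv_small <- cv_logloss(y ~ x1, d)
cv_full  <- cv_logloss(y ~ x1 + x2, d)

aic_small <- AIC(glm(y ~ x1, family = binomial, data = d))
aic_full  <- AIC(glm(y ~ x1 + x2, family = binomial, data = d))

c(cv_small = cv_small, cv_full = cv_full)
c(aic_small = aic_small, aic_full = aic_full)
```

Lower is better for both measures. Because the fold assignment is random, repeating the cross-validation with different seeds gives a feel for how much the single-split result discussed below can vary.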





From: Jay josip.2...@gmail.com
Sent by: r-help-boun...@r-project.org
Date: 04/02/2010 09:14 AM
To: r-help@r-project.org
Subject: [R] Cross-validation for parameter selection (glm/logit)






My aim is to select a good subset of parameters for my final logit
model built using glm(). What is the best way to cross-validate the
results so that they are reliable?

Let's say that I have a large dataset of 1000's of observations. I
split this data into two groups, one that I use for training and
another for validation. First I use the training set to build a model,
and then run stepAIC() with a forward-backward search. BUT, if I base
my parameter selection purely on this result, I suppose it will be
somewhat skewed due to the one-time data split (I use only 1 training
dataset).

What is the correct way to perform this variable selection? And are
there readily available packages for this?

Similarly, when I have my final parameter set, how should I go about
making the final assessment of the model's predictive ability? CV? What
package?


Thank you in advance,
Jay



Re: [R] Summing data based on certain conditions

2010-04-02 Thread David Winsemius


On Apr 2, 2010, at 10:36 AM, Steve Murray wrote:



Dear all,

Thanks for the contributions so far. I've had a look at these and  
the closest I've come to solving it is the following:



data_ave <- ave(data$rammday, by=c(data$month, data$year))

Warning messages:
1: In split.default(x, g) :
  data length is not a multiple of split variable
2: In split.default(seq_along(x), f, drop = drop, ...) :
  data length is not a multiple of split variable


I'm slightly confused by the warning message, as the data lengths do  
appear the same:



dim(data)

[1] 1073    6

length(data$year)

[1] 1073

length(data$month)

[1] 1073


All true, no doubt, but did you look at

length (c(data$month, data$year) )  # ??

--
David.
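
To spell out the hint with a toy case: c() glues the two columns end to end, so the grouping vector comes out twice as long as the data, which is what the warning is complaining about. Passing the columns as separate arguments lets ave() group on their combination. The vectors below are invented for illustration and are not the poster's data.

```r
# Small made-up vectors standing in for data$month, data$year, data$rammday.
month <- c(3, 3, 4, 4)
year  <- c(1988, 1988, 1988, 1989)
rain  <- c(1, 3, 5, 7)

# c() concatenates: the result has 8 elements for 4 rows of data.
grp_len <- length(c(month, year))

# Separate grouping arguments: one group per month/year combination.
# FUN defaults to mean, and the result has one value per input row.
grp_mean <- ave(rain, month, year)

grp_len
grp_mean
```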



Maybe the approach I'm taking is wrong. Any suggestions would be  
gratefully received.


Many thanks,

Steve




Date: Wed, 31 Mar 2010 23:31:25 +0200
From: stephan.kola...@gmx.de
To: smurray...@hotmail.com
CC: r-help@r-project.org
Subject: Re: [R] Summing data based on certain conditions

?by may also be helpful.

Stephan


Steve Murray schrieb:

Dear all,

I have a dataset of 1073 rows, the first 15 which look as follows:


data[1:15,]

date year month day rammday thmmday
1 3/8/1988 1988 3 8 1.43 0.94
2 3/15/1988 1988 3 15 2.86 0.66
3 3/22/1988 1988 3 22 5.06 3.43
4 3/29/1988 1988 3 29 18.76 10.93
5 4/5/1988 1988 4 5 4.49 2.70
6 4/12/1988 1988 4 12 8.57 4.59
7 4/16/1988 1988 4 16 31.18 22.18
8 4/19/1988 1988 4 19 19.67 12.33
9 4/26/1988 1988 4 26 3.14 1.79
10 5/3/1988 1988 5 3 11.51 6.33
11 5/10/1988 1988 5 10 5.64 2.89
12 5/17/1988 1988 5 17 37.46 20.89
13 5/24/1988 1988 5 24 9.86 9.81
14 5/31/1988 1988 5 31 13.00 8.63
15 6/7/1988 1988 6 7 0.43 0.00


I am looking for a way by which I can create monthly totals of  
rammday (rainfall in mm/day; column 5) by doing the following:


For each case where the month value and the year are the same  
(e.g. 3 and 1988, in the first four rows), find the mean of the  
corresponding rammday values and then multiply by the number of  
days in that month (i.e. 31 in this case).


Note however that the number of month values in each case isn't  
always the same (e.g. in this subset of data, there are 4 values  
for month 3, 5 for month 4 and 5 for month 5). Also the months  
will of course recycle for the following years, so it's not simply  
a case of finding a monthly total for *all* the 3s in the whole  
dataset, just those associated with each year in turn.


How would I go about doing this in R?

Any help will be gratefully received.

Many thanks,

Steve





David Winsemius, MD
West Hartford, CT



Re: [R] Merge failure using zoo package

2010-04-02 Thread Gabor Grothendieck
The files only have one data column.  What is the meaning of x[,2],
etc. ?   What is z1?

Please provide reproducible code and data all in a single file using
this style so it's clear what is what. Also please cut down the size of
your data to the smallest size that will still illustrate the problem.

Lines1 <- "a,b
1,2
3,4"

library(zoo)
library(chron)
z1 <- read.zoo(textConnection(Lines1), header = TRUE, sep = ",", FUN = ...)
etc.
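
As a package-free cross-check of what a complete merge with linear interpolation ought to look like, the alignment can be sketched in base R with approx(). This is not zoo code and not the poster's commands: to_secs() is a hypothetical helper, and only the first few rows of the posted test1 and test2 data are used.

```r
# Hypothetical helper: "HH:MM:SS" strings to seconds since midnight.
to_secs <- function(hms) {
  p <- matrix(as.numeric(unlist(strsplit(hms, ":", fixed = TRUE))),
              ncol = 3, byrow = TRUE)
  p[, 1] * 3600 + p[, 2] * 60 + p[, 3]
}

# First rows of the two posted series.
t1 <- to_secs(c("01:01:00", "01:01:30", "01:02:00", "01:02:30"))
v1 <- c(0.73512097, 0.34860813, 0.61306418, 0.01495898)
t2 <- to_secs(c("01:01:01", "01:01:42", "01:02:23"))
v2 <- c(0.17558467, 0.23806514, 0.75867726)

# Union of all time stamps, then interpolate each series onto it.
# approx() returns NA outside a series' own time range by default.
all_t  <- sort(union(t1, t2))
merged <- data.frame(
  secs     = all_t,
  dataset1 = approx(t1, v1, xout = all_t)$y,
  dataset2 = approx(t2, v2, xout = all_t)$y
)
merged
```

Every time stamp from either series appears exactly once in the result, which gives a reference to compare the zoo output against once the reproducible example is sorted out.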


On Fri, Apr 2, 2010 at 10:55 AM, e-letter inp...@gmail.com wrote:
 Readers,

 Please refer to attached example data files. It seems that the merge
 function fails for the latter section of the data set. Command
 terminal output:

 library(chron)
 library(zoo)
 x <- read.zoo("test1.csv", header=TRUE, sep=",", FUN=times)
 y <- read.zoo("test2.csv", header=TRUE, sep=",", FUN=times)
 z <- (na.approx(merge(x[,2], y[,2]), time(z1)))
 z
             x[, 2]    y[, 2]
 01:01:01 0.5418645 0.1755847
 01:01:30 0.3486081 0.2068249
 01:01:42 0.4808362 0.2380651
 01:02:00 0.6130642 0.4983712
 01:02:23 0.3140116 0.7586773
 01:19:00 0.8545863 0.8927112
 01:24:00 0.965 0.1490374

 To overcome this behaviour the files test3 and test4 were created by
 removing data that had been merged previously. Command terminal output
 below:

 x <- read.zoo("test3.csv", header=TRUE, sep=",", FUN=times)
 y <- read.zoo("test4.csv", header=TRUE, sep=",", FUN=times)
 z <- (na.approx(merge(x[,2], y[,2]), time(z1)))
 z
             x[, 2]    y[, 2]
 01:03:06 0.4827475 0.7350236
 01:03:30 0.6951390 0.8376028
 01:03:50 0.5798283 0.9401821
 01:04:00 0.4645176 0.8330635
 01:04:30 0.6167257 0.7259450
 01:19:00 0.8545863 0.8927112
 01:24:00 0.965 0.1490374

 The only way to obtain a more complete merge of the data sets is to
 create manually new files where previously merged data is removed and
 then put all the merged data into a new file. Surely this package
 should merge the data sets completely?

 yours,

 rh...@conference.jabber.org
 r251
 mandriva2008



Re: [R] Exporting Nuopt from splus to R

2010-04-02 Thread Ben Bolker
Jp2010 mandans_p at yahoo.com writes:

From my understanding it is going to be difficult, is that my understanding
 right.?

  Probably impossible ...
  TIBCO, or whoever owns S-PLUS now (I don't pay much attention, so
it's hard for me to keep track) does try to achieve as much R-compatibility
as possible: see http://csan.insightful.com/doc/spluspackages.pdf .  If
your use of NUOPT is mission-critical, or if the price of S-PLUS is
not prohibitive, it might make sense to keep using S-PLUS to some
extent.

  If you can't do that, take a look at

http://cran.r-project.org/web/views/Optimization.html

 to see if R can do the things you are using NUOPT for.



Re: [R] Merge failure using zoo package

2010-04-02 Thread e-letter
On 02/04/2010, Gabor Grothendieck ggrothendi...@gmail.com wrote:
 The files only have one data column.  What is the meaning of x[,2],
 etc. ?   What is z1?

I only want to merge one column from one file with one column from
another file. With x[,2], I am trying to select the column of data.

 Please provide reproducible code and data all in a single file using
 this style so its clear what is what. Also please cut down the size of
 your data to the smallest size that will still illustrate the problem.

See other posting; each file that I used is separated by an empty
line. This error seems to occur with the data set size as shown in the
files.



[R] R abrupt exit

2010-04-02 Thread jacob
Dear List:

I recently ran into quite an annoying problem while running R on Ubuntu 9.10.
While a program is running, the system suddenly exits from the R
session with the following warnings:

 #
OMP: Hint: This may cause performance degradation and correctness
issues. Set environment variable KMP_DUPLICATE_LIB_OK=TRUE to ignore
this problem and force the program to continue anyway. Please note
that the use of KMP_DUPLICATE_LIB_OK is unsupported and using it may
cause undefined behavior. For more information, please contact
Intel(R) Premier Support.
Aborted
##
I have to restart R, and all the calculations are lost.

Following the warning, I set the environment variable to TRUE, which
did not help. Reinstalling R did not help either. I searched for the
problem, but there seems to be no R help on the topic so far.


Could something elsewhere in the OS be conflicting? I have only one
copy of the following file on my system:

/usr/lib/R/lib/libguide.so


Thanks.

jacob

#
> options("KMP_DUPLICATE_LIB_OK")
$KMP_DUPLICATE_LIB_OK
[1] TRUE
> sessionInfo()
R version 2.10.1 (2009-12-14)
x86_64-pc-linux-gnu

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] datasets  grDevices splines   graphics  stats     utils     methods
[8] base

other attached packages:
 [1] Design_2.3-0         Hmisc_3.7-0          survival_2.35-8
 [4] GEOquery_2.11.3      RCurl_1.4-0          bitops_1.0-4.1
 [7] affy_1.24.2          Biobase_2.6.1        preprocessCore_1.8.0
[10] R.methodsS3_1.0.3

loaded via a namespace (and not attached):
[1] affyio_1.14.0  cluster_1.12.1 grid_2.10.1    lattice_0.18-3


Re: [R] mcmcglmm starting value example

2010-04-02 Thread Jarrod Hadfield

Dear Ping,

It is not possible to pass starting values for the fixed effects. It  
doesn't make much sense to give starting values for the fixed effects  
because they can be Gibbs sampled in a single pass conditional on the  
latent variables and the (co)variance components - after a single  
iteration they would forget their starting values.  The cut points  
in an ordinal model are a different matter. At the moment I do not  
allow user defined starting values for the cut points, but agree that  
it may be useful. I am about to release an update that allows spline  
fitting in MCMCglmm. If it is starting values for the cut points  
you're really after I can add that in before I release?


Cheers,

Jarrod


On 29 Mar 2010, at 22:06, ping chen wrote:


Hi R-users:

Can anyone give an example of giving starting values for MCMCglmm?
I can't find any anywhere.
I have 1 random effect (physicians, and there are 50 of them)
and family=ordinal.

How can I specify starting values for my fixed effects? It doesn't  
seem to have the option to do so.


Thanks, Ping









Re: [R] Merge failure using zoo package

2010-04-02 Thread Gabor Grothendieck
The code does not run with the files.  I need the requested
information, namely a single file containing code and data that I
can copy into a session without editing to see the result you
see.

On Fri, Apr 2, 2010 at 11:27 AM, e-letter inp...@gmail.com wrote:
 On 02/04/2010, Gabor Grothendieck ggrothendi...@gmail.com wrote:
 The files only have one data column.  What is the meaning of x[,2],
 etc. ?   What is z1?

 I only want to merge one column from one file with one column from
 another file. With x[,2], I am trying to select the column of data.

 Please provide reproducible code and data all in a single file using
 this style so its clear what is what. Also please cut down the size of
 your data to the smallest size that will still illustrate the problem.

 See other posting; each file that I used is separated by an empty
 line. This error seems to occur with the data set size as shown in the
 files.




Re: [R] Exporting Nuopt from splus to R

2010-04-02 Thread Uwe Ligges



On 02.04.2010 01:16, Jp2010 wrote:


Hi all,
Thanks for the wonderful forum with all the valuable help and comments here.

I have been an S-PLUS user for the past 7 to 8 years and am now thinking
of changing over to R. I have been doing a lot of reading, and one of the
main reasons is that R is open source, with all the wonderful things that
come with that.

My question, though: is it possible to export any of the functions or
libraries that come with S-PLUS to R?

For my specific situation (Windows platform): if there is a compiled s.dll,
is there a way we can get this working in R? I would think that if it is a
function or source file it can probably be rewritten in R without much
difficulty. But what about the compiled files? I am not a systems
programmer, so I don't know much about compiling or undoing that.

From my understanding it is going to be difficult. Is my understanding
right?

Thanks



If you are talking about an already compiled DLL, it is not possible to 
get it working. You need to recompile the sources and create a new 
library.


Uwe Ligges



Re: [R] Merge failure using zoo package

2010-04-02 Thread e-letter
On 02/04/2010, Gabor Grothendieck ggrothendi...@gmail.com wrote:
 The code does not run with the files.  I need the requested
 information, namely a single file containing code and data and that I
 can just copy into a session without editing and see the result you
 see.

I don't understand how I can combine the four csv files into a single
file of data and terminal commands. Anyway, here is further terminal output.

The following also occurs after correcting the commands. The data
merge is incomplete (it stops at 1:27:30); data set 1 ends at time
1:45:30 and data set 2 at 1:31:59.

> library(chron)
> library(zoo)
> z1 <- read.zoo("test1.csv", header=TRUE, sep=",", FUN=times)
> z2 <- read.zoo("test2.csv", header=TRUE, sep=",", FUN=times)
> z3 <- (na.approx(merge(z1[,2], z2[,2]), time(z1)))
> z3
            z1[, 2]    z2[, 2]
01:01:01 0.54186455 0.17558467
01:01:30 0.34860813 0.20682491
01:01:42 0.48083615 0.23806514
01:02:00 0.61306418 0.49837120
01:02:23 0.31401158 0.75867726
01:02:30 0.01495898 0.75079270
01:03:00 0.27035612 0.74290813
01:03:06 0.48274755 0.73502357
01:03:30 0.69513898 0.83760282
01:03:50 0.57982828 0.94018206
01:04:00 0.46451758 0.83306352
01:04:30 0.61672569 0.72594497
01:04:35 0.72084346 0.61882643
01:05:00 0.82496122 0.65150068
01:05:21 0.58631138 0.68417492
01:05:30 0.34766154 0.47526482
01:06:00 0.69618714 0.26635471
01:06:08 0.54326964 0.05744461
01:06:30 0.39035214 0.19544428
01:06:55 0.20357679 0.33344394
01:07:00 0.01680143 0.45147127
01:07:30 0.28576967 0.56949860
01:07:44 0.14891191 0.68752593
01:08:00 0.01205416 0.51591885
01:08:30 0.89637254 0.34431177
01:08:33 0.76392454 0.17270469
01:09:00 0.63147653 0.49396296
01:09:23 0.32334896 0.81522124
01:09:30 0.01522139 0.77116200
01:10:00 0.27661960 0.72710276
01:10:03 0.39318042 0.68304352
01:10:30 0.50974124 0.53539217
01:10:43 0.59558051 0.38774082
01:11:00 0.68141977 0.61475486
01:11:23 0.79433915 0.84176890
01:11:30 0.90725854 0.59232742
01:12:00 0.83823443 0.34288594
01:12:04 0.68591842 0.0936
01:12:30 0.53360241 0.11388206
01:12:44 0.35564718 0.13431965
01:13:00 0.17769196 0.52821343
01:13:25 0.50603906 0.92210721
01:13:30 0.83438616 0.72684026
01:14:00 0.67248807 0.53157330
01:14:06 0.38620370 0.33630635
01:14:30 0.09991933 0.45160464
01:14:47 0.06663450 0.56690294
01:15:00 0.03334966 0.33280555
01:15:29 0.48313660 0.09870816
01:15:30 0.93292355 0.32535246
01:16:00 0.15990837 0.55199675
01:16:11 0.10672443 0.77864105
01:16:30 0.05354050 0.69833773
01:16:53 0.30317627 0.61803441
01:17:00 0.55281203 0.44246870
01:17:30 0.37845690 0.26690299
01:17:35 0.63448528 0.09133728
01:18:00 0.89051365 0.09029608
01:18:17 0.52862829 0.08925487
01:18:30 0.16674292 0.49098302
01:19:00 0.85458626 0.89271117
01:19:30 0.19278550 0.72938430
01:19:42 0.46259477 0.56605742
01:20:00 0.73240405 0.77563138
01:20:25 0.44828965 0.98520534
01:20:30 0.16417524 0.87715304
01:21:00 0.73878212 0.76910073
01:21:08 0.62834165 0.66104843
01:21:30 0.51790118 0.81526716
01:21:51 0.67433278 0.96948589
01:22:00 0.83076438 0.66529956
01:22:30 0.4704 0.36111323
01:22:34 0.24832072 0.05692690
01:23:00 0.02108640 0.38790073
01:23:17 0.42509847 0.71887456
01:23:30 0.82911053 0.43395599
01:24:00 0.9646 0.14903741
01:24:30 0.14493657 0.50736593
01:24:43 0.49457995 0.86569445
01:25:00 0.84422332 0.57246479
01:25:26 0.63006153 0.27923513
01:25:30 0.41589974 0.51404020
01:26:00 0.67606367 0.74884526
01:26:09 0.34106401 0.98365033
01:26:30 0.00606434 0.53336716
01:26:53 0.30279212 0.08308399
01:27:00 0.59951991 0.34562608
01:27:30 0.43949260 0.60816818




Re: [R] R abrupt exit

2010-04-02 Thread Peter Ehlers

Google leads to some discussion on the Intel Software Network:

 http://software.intel.com/en-us/forums/showthread.php?t=64585

Be warned: I haven't read the discussion.

 -Peter Ehlers

On 2010-04-02 9:30, jacob wrote:

Dear Lists:

I recently ran into a quite annoying problem while running R on Ubuntu 9.10.
While running a program, the system suddenly exits from the R
session with the following warnings:

 #
OMP: Hint: This may cause performance degradation and correctness
issues. Set environment variable KMP_DUPLICATE_LIB_OK=TRUE to ignore
this problem and force the program to continue anyway. Please note
that the use of KMP_DUPLICATE_LIB_OK is unsupported and using it may
cause undefined behavior. For more information, please contact
Intel(R) Premier Support.
Aborted
##
I have to restart R, and all the calculations are lost.

Following the warning, I set the environment variable to TRUE, but that
did not help. I reinstalled R, which did not help either. I googled the
problem, but there seems to be no R help on the topic so far.


Does anyone have suggestions as to whether there may be conflicts in the
OS as a whole? I have only one copy of the following file on my system:

/usr/lib/R/lib/libguide.so


Thanks.

jacob

#

> options("KMP_DUPLICATE_LIB_OK")

$KMP_DUPLICATE_LIB_OK
[1] TRUE

sessionInfo()

R version 2.10.1 (2009-12-14)
x86_64-pc-linux-gnu

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] datasets  grDevices splines   graphics  stats     utils     methods
[8] base

other attached packages:
 [1] Design_2.3-0         Hmisc_3.7-0          survival_2.35-8
 [4] GEOquery_2.11.3      RCurl_1.4-0          bitops_1.0-4.1
 [7] affy_1.24.2          Biobase_2.6.1        preprocessCore_1.8.0
[10] R.methodsS3_1.0.3

loaded via a namespace (and not attached):
[1] affyio_1.14.0  cluster_1.12.1 grid_2.10.1    lattice_0.18-3
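
A note on the options() call shown in this message: options() sets an
R-level option, whereas the OMP hint refers to an operating-system
environment variable, and the two are stored separately. The sketch below
uses only base R to show the difference. Whether setting the variable from
inside an already running session has any effect on the OpenMP runtime is
an assumption that has not been checked here; such variables are typically
read when the library is loaded, and the hint itself describes the setting
as unsupported.

```r
# An R option and an environment variable live in different places.
options(KMP_DUPLICATE_LIB_OK = TRUE)            # R-level option only
r_option <- getOption("KMP_DUPLICATE_LIB_OK")   # visible to R code

# The environment variable is untouched by options(); NA means "not set"
env_before <- Sys.getenv("KMP_DUPLICATE_LIB_OK", unset = NA)

# Set the environment variable for this R process and its child processes
Sys.setenv(KMP_DUPLICATE_LIB_OK = "TRUE")
env_after <- Sys.getenv("KMP_DUPLICATE_LIB_OK")
```

Setting the variable in the shell before R is started is probably the more
reliable route, since the runtime then sees it from the beginning.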





--
Peter Ehlers
University of Calgary



Re: [R] Merge failure using zoo package

2010-04-02 Thread Gabor Grothendieck
Below is the format that was requested.  This has the data followed by
the corrected code at the end.  There are several things that were
wrong:

1. z1[,2] is wrong since z1 is a vector, not a 2d matrix.  Ditto for
z2.  Ideally zoo would have given an error message, but in any case it's
wrong.  It should be just z1.
2. The merge works ok, but the second argument to na.approx should be
xout=time(z1).   xout= was missing.
3. There were a number of bugs removed from na.approx in the devel
version of zoo.  I don't think they impact this, but if you have any
problems with na.approx then uncomment the source statement in the
code below. It will bring the development version of na.approx into
your workspace.
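
The first two points can be sketched on a small scale as follows. This is
a minimal illustration, not the corrected code from this message: the two
series are hypothetical, and their index is plain numeric seconds rather
than chron times, so that the example needs only the zoo package.

```r
library(zoo)

# Hypothetical series: z1 on a regular grid, z2 at irregular time points
z1 <- zoo(c(1, 3, 5), c(0, 30, 60))
z2 <- zoo(c(10, 20), c(10, 50))

# Merge the univariate series directly (no [,2] subscript); time points
# that occur in only one series are filled with NA in the other
m <- merge(z1, z2)

# Interpolate the NAs linearly and keep only the time points of z1;
# note that xout= is named explicitly
z3 <- na.approx(m, xout = time(z1))
z3
```

Time points of z1 that lie outside the range covered by z2 stay NA in the
z2 column, because linear interpolation does not extrapolate by default.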

Lines1 <- "time1,dataset1
01:01:00,0.73512097
01:01:30,0.34860813
01:02:00,0.61306418
01:02:30,0.01495898
01:03:00,0.27035612
01:03:30,0.69513898
01:04:00,0.46451758
01:04:30,0.61672569
01:05:00,0.82496122
01:05:30,0.34766154
01:06:00,0.69618714
01:06:30,0.39035214
01:07:00,0.01680143
01:07:30,0.28576967
01:08:00,0.01205416
01:08:30,0.89637254
01:09:00,0.63147653
01:09:30,0.01522139
01:10:00,0.27661960
01:10:30,0.50974124
01:11:00,0.68141977
01:11:30,0.90725854
01:12:00,0.83823443
01:12:30,0.53360241
01:13:00,0.17769196
01:13:30,0.83438616
01:14:00,0.67248807
01:14:30,0.09991933
01:15:00,0.03334966
01:15:30,0.93292355
01:16:00,0.15990837
01:16:30,0.05354050
01:17:00,0.55281203
01:17:30,0.37845690
01:18:00,0.89051365
01:18:30,0.16674292
01:19:00,0.85458626
01:19:30,0.19278550
01:20:00,0.73240405
01:20:30,0.16417524
01:21:00,0.73878212
01:21:30,0.51790118
01:22:00,0.83076438
01:22:30,0.4704
01:23:00,0.02108640
01:23:30,0.82911053
01:24:00,0.9646
01:24:30,0.14493657
01:25:00,0.84422332
01:25:30,0.41589974
01:26:00,0.67606367
01:26:30,0.00606434
01:27:00,0.59951991
01:27:30,0.43949260
01:28:00,0.66297385
01:28:30,0.33131298
01:29:00,0.06102041
01:29:30,0.84722118
01:30:00,0.46841491
01:30:30,0.34200755
01:31:00,0.87386578
01:31:30,0.70737403
01:32:00,0.23978781
01:32:30,0.11787278
01:33:00,0.14679814
01:33:30,0.65217063
01:34:00,0.81355908
01:34:30,0.31583482
01:35:00,0.92167666
01:35:30,0.55931271
01:36:00,0.13641271
01:36:30,0.35048575
01:37:00,0.17243584
01:37:30,0.93645686
01:38:00,0.85356548
01:38:30,0.61399352
01:39:00,0.05910707
01:39:30,0.01721605
01:40:00,0.94845557
01:40:30,0.48117810
01:41:00,0.34752402
01:41:30,0.59295472
01:42:00,0.64267429
01:42:30,0.57859933
01:43:00,0.00201441
01:43:30,0.32530995
01:44:00,0.25474645
01:44:30,0.93187534
01:45:00,0.99361033
01:45:30,0.16591641"

Lines2 <- "time2,dataset2
01:01:01,0.17558467
01:01:42,0.23806514
01:02:23,0.75867726
01:03:06,0.73502357
01:03:50,0.94018206
01:04:35,0.61882643
01:05:21,0.68417492
01:06:08,0.05744461
01:06:55,0.33344394
01:07:44,0.68752593
01:08:33,0.17270469
01:09:23,0.81522124
01:10:03,0.68304352
01:10:43,0.38774082
01:11:23,0.84176890
01:12:04,0.0936
01:12:44,0.13431965
01:13:25,0.92210721
01:14:06,0.33630635
01:14:47,0.56690294
01:15:29,0.09870816
01:16:11,0.77864105
01:16:53,0.61803441
01:17:35,0.09133728
01:18:17,0.08925487
01:19:00,0.89271117
01:19:42,0.56605742
01:20:25,0.98520534
01:21:08,0.66104843
01:21:51,0.96948589
01:22:34,0.05692690
01:23:17,0.71887456
01:24:00,0.14903741
01:24:43,0.86569445
01:25:26,0.27923513
01:26:09,0.98365033
01:26:53,0.08308399
01:27:36,0.87071027
01:28:19,0.26475705
01:29:03,0.76409811
01:29:47,0.59563256
01:30:31,0.23995054
01:31:14,0.00951054
01:31:59,0.21367270"

Lines3 <- "time1,dataset1
01:02:30,0.01495898
01:03:00,0.27035612
01:03:30,0.69513898
01:04:00,0.46451758
01:04:30,0.61672569
01:05:00,0.82496122
01:05:30,0.34766154
01:06:00,0.69618714
01:06:30,0.39035214
01:07:00,0.01680143
01:07:30,0.28576967
01:08:00,0.01205416
01:08:30,0.89637254
01:09:00,0.63147653
01:09:30,0.01522139
01:10:00,0.27661960
01:10:30,0.50974124
01:11:00,0.68141977
01:11:30,0.90725854
01:12:00,0.83823443
01:12:30,0.53360241
01:13:00,0.17769196
01:13:30,0.83438616
01:14:00,0.67248807
01:14:30,0.09991933
01:15:00,0.03334966
01:15:30,0.93292355
01:16:00,0.15990837
01:16:30,0.05354050
01:17:00,0.55281203
01:17:30,0.37845690
01:18:00,0.89051365
01:18:30,0.16674292
01:19:00,0.85458626
01:19:30,0.19278550
01:20:00,0.73240405
01:20:30,0.16417524
01:21:00,0.73878212
01:21:30,0.51790118
01:22:00,0.83076438
01:22:30,0.4704
01:23:00,0.02108640
01:23:30,0.82911053
01:24:00,0.9646
01:24:30,0.14493657
01:25:00,0.84422332
01:25:30,0.41589974
01:26:00,0.67606367
01:26:30,0.00606434
01:27:00,0.59951991
01:27:30,0.43949260
01:28:00,0.66297385
01:28:30,0.33131298
01:29:00,0.06102041
01:29:30,0.84722118
01:30:00,0.46841491
01:30:30,0.34200755
01:31:00,0.87386578
01:31:30,0.70737403
01:32:00,0.23978781
01:32:30,0.11787278
01:33:00,0.14679814
01:33:30,0.65217063
01:34:00,0.81355908
01:34:30,0.31583482
01:35:00,0.92167666
01:35:30,0.55931271
01:36:00,0.13641271
01:36:30,0.35048575
01:37:00,0.17243584
01:37:30,0.93645686
01:38:00,0.85356548
01:38:30,0.61399352
01:39:00,0.05910707
01:39:30,0.01721605
01:40:00,0.94845557

[R] angles

2010-04-02 Thread Tom_R

Hi R users,

I would like to construct a sort of hybrid vector/scatter plot.

My data is in the following format:  3-column x,y,z data-frame in which
every row is a separate data-point.
The x & y columns are coordinates, and the z column contains orientation
data (range 0-180 degrees, with East=0 & North=90).

I need to set each x,y point to have the alignment in z. Hence my 'vectors'
would simply be lines with the mid-point at x,y and without arrow-heads. 

R's normal vector plot requires a pair of x,y coords for the start & end of
each vector, whereas I just have an orientation.

Any ideas?

Cheers!
Tom



Re: [R] Merge failure using zoo package

2010-04-02 Thread William Dunlap
 -Original Message-
 From: r-help-boun...@r-project.org 
 [mailto:r-help-boun...@r-project.org] On Behalf Of e-letter
 Sent: Friday, April 02, 2010 9:20 AM
 To: Gabor Grothendieck
 Cc: r-help@r-project.org
 Subject: Re: [R] Merge failure using zoo package
 
 On 02/04/2010, Gabor Grothendieck ggrothendi...@gmail.com wrote:
  The code does not run with the files.  I need the requested
  information, namely a single file containing code and data 
 and that I
  can just copy into a session without editing and see the result you
  see.
 
 I don't understand how I can combine the four csv files into a single
 file of data and terminal commands?

One way is to use a call to textConnection() instead of a
file name.

E.g., if you show a file called data.txt containing the lines
VarA VarB
   1    2
   3    4
and you read that into R with
   data <- read.table(header=TRUE, "data.txt")
then the R-helper needs to copy the file contents into
an editor, save the file under the appropriate name,
then copy the command into an R session.

However, you can replace the file and the original read.table
command with one command that an R-helper can paste into an
R session:

   data <- read.table(header=TRUE, textConnection("
VarA VarB
   1    2
   3    4
"))

Another approach is to use dput(data) to print the
dataset and stick a 'data <-' on the front of what
was printed.  E.g.,  with the above 'data' you can do
   > dput(data)
  structure(list(VarA = c(1L, 3L), VarB = c(2L, 4L)), .Names = c("VarA",
  "VarB"), class = "data.frame", row.names = c(NA, -2L))
and send R-help the command
  data <- 
  structure(list(VarA = c(1L, 3L), VarB = c(2L, 4L)), .Names = c("VarA",
  "VarB"), class = "data.frame", row.names = c(NA, -2L))
You may have to insert some line breaks in sensible positions
so that Outlook or Exchange doesn't break lines in nonsensical
positions.  Again, the R-helper can copy and paste that
code into an R session and come up with a dataset identical
to yours.
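
Both approaches can be checked against each other in a few lines of base
R. This is only a sketch on the hypothetical two-row table used in this
message; the object names Lines, data1 and data2 are made up for the
illustration.

```r
# The data, embedded in the script as a multi-line string
Lines <- "VarA VarB
   1    2
   3    4"

# Approach 1: read the string as if it were a file
con <- textConnection(Lines)
data1 <- read.table(con, header = TRUE)
close(con)

# Approach 2: dput() prints R code that recreates the object;
# here that code is captured and evaluated instead of being pasted
dput_code <- capture.output(dput(data1))
data2 <- eval(parse(text = dput_code))

# Both routes should give the same data frame
identical(data1, data2)
```

In a posting one would paste the printed dput() output rather than
capture it; the capture step only stands in for the copy and paste.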

(I've seen copy-n-pastable code in R-help that starts with
   remove(list=ls())
The suggestion to remove all objects really puts off R-helpers.)

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com 


Re: [R] Adding regression lines to each factor on a plot when using ANCOVA

2010-04-02 Thread Michael Friendly
This is a nice example; thanks for providing it in this form.  I tried 
to trim it down to show fewer groups, but ran into the following errors

that I can't understand:

## keep species 1:6
> dataset <- subset(dataset, species < 7)
Warning message:
In Ops.factor(species, 7) : < not meaningful for factors

## OK, just subset the rows of dataset to keep species 1:6

> dataset <- dataset[1:20,]
> ancova(logBeak ~ logMass   * species, data=dataset)
Error in `contrasts<-`(`*tmp*`, value = "contr.treatment") :
  contrasts can be applied only to factors with 2 or more levels
> ancova(logBeak ~ logMass   + species, data=dataset)
Error in `contrasts<-`(`*tmp*`, value = "contr.treatment") :
  contrasts can be applied only to factors with 2 or more levels

-Michael
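
One possible way around the factor comparison, sketched on hypothetical
stand-in data: compare the numeric values behind the factor labels, then
drop the levels that are no longer used. This has not been run against
ancova() from HH, so whether it removes the contrasts error in that
function is an assumption. Note also that the failed subset() call above
has already replaced dataset, so the data frame would need to be rebuilt
first.

```r
# Hypothetical stand-in for the thread's data: species coded 1..12
species <- factor(rep(1:12, each = 2))
dataset <- data.frame(species, y = seq_along(species))

# "<" is not defined for factors, so compare the numeric values
# behind the labels instead
keep <- as.numeric(as.character(dataset$species)) < 7

# Subset the rows, then drop the now-unused levels 7..12
small <- droplevels(dataset[keep, ])

nlevels(small$species)
```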


RICHARD M. HEIBERGER wrote:

## Steve,

## please use the ancova function in the HH package.

install.packages("HH")
library(HH)


## windows.options(record=TRUE)
windows.options(record=TRUE)
# hypothetical data
beak.lgth <-
c(2.3,4.2,2.7,3.4,4.2,4.8,1.9,2.2,1.7,2.5,15,16.5,14.7,9.6,8.5,9.1,
  9.4,17.7,15.6,14,6.8,8.5,9.4,10.5,10.9,11.2,11.5,19,17.2,18.9,
  19.5,19.9,12.6,12.1,12.9,14.1,12.5,15,14.8,4.3,5.7,2.4,3.5,2.9)
mass <-
c(45.9,47.1,47.6,17.2,17.9,17.7,44.9,44.8,45.3,44.9,39,39.7,41.2,
  84.8,79.2,78.3,82.8,102.8,107.2,104.1,51.7,45.5,50.6,27.5,26.6,
  27.5,26.9,25.4,23.7,21.7,22.2,23.8,46.9,51.5,49.4,33.4,33.1,33.2,
  34.7,39.3,41.7,40.5,42.7,41.8)
## Make species into a factor
species <-
factor(c(1,1,1,2,2,2,3,3,3,3,4,4,4,5,5,5,5,6,6,6,7,7,7,
         8,8,8,8,9,9,9,9,9,10,10,10,11,11,11,11,12,12,12,12,12))
## then construct a data.frame with the three variables and the log
## transforms
dataset <- data.frame(species, beak.lgth, mass,
                      logBeak=log10(beak.lgth),
                      logMass=log10(mass))
## default is 7 colors, we need 12
trellis.par.set("superpose.line",
  Rows(trellis.par.get("superpose.line"), c(1:6, 1:6)))
trellis.par.set("superpose.symbol",
  Rows(trellis.par.get("superpose.symbol"), c(1:6, 1:6)))

ancova(logBeak ~ logMass   * species, data=dataset)
ancova(logBeak ~ logMass   + species, data=dataset)
ancova(logBeak ~ logMass, groups=species, data=dataset)
ancova(logBeak ~ species, x=logMass, data=dataset)
bwplot(logBeak ~ species, data=dataset)

## Rich

[[alternative HTML version deleted]]




--
Michael Friendly Email: friendly AT yorku DOT ca
Professor, Psychology Dept.
York University  Voice: 416 736-5115 x66249 Fax: 416 736-5814
4700 Keele Streethttp://www.math.yorku.ca/SCS/friendly.html
Toronto, ONT  M3J 1P3 CANADA



Re: [R] angles

2010-04-02 Thread Greg Snow
Look at the my.symbols function in the TeachingDemos package.  You can get line 
segments using the ms.arrows function and setting the length argument to 0 (or 
you can make your own plotting function by copying ms.arrows and replacing the 
call to arrows with a call to segments).
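
For comparison, the same kind of plot can be sketched with base graphics
alone: convert each orientation into the two end points of a short line
centred on the data point and draw them with segments(). The helper name
orientation_segments, the len argument and the three data rows below are
made up for the illustration; asp = 1 is assumed so that angles are not
distorted by unequal axis scales.

```r
# End points of a line of length `len`, centred at (x, y), at `angle_deg`
# degrees counter-clockwise from East (so East = 0 and North = 90)
orientation_segments <- function(x, y, angle_deg, len = 1) {
  theta <- angle_deg * pi / 180
  dx <- cos(theta) * len / 2
  dy <- sin(theta) * len / 2
  data.frame(x0 = x - dx, y0 = y - dy, x1 = x + dx, y1 = y + dy)
}

# Hypothetical data in the x, y, z layout described in the question
d <- data.frame(x = c(1, 2, 3), y = c(1, 2, 1), z = c(0, 90, 45))
s <- orientation_segments(d$x, d$y, d$z, len = 0.5)

# Empty plot with equal axis scaling, then one line per data point
plot(d$x, d$y, type = "n", asp = 1, xlab = "x", ylab = "y")
segments(s$x0, s$y0, s$x1, s$y1)
```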


Hope this helps,

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
 project.org] On Behalf Of Tom_R
 Sent: Friday, April 02, 2010 10:13 AM
 To: r-help@r-project.org
 Subject: [R] angles
 
 
 Hi R users,
 
 I would like to construct a sort of hybrid vector/scatter plot.
 
 My data is in the following format:  3-column x,y,z data-frame in which
 every row is a separate data-point.
 The x & y columns are coordinates, and the z column contains
 orientation
 data (range 0-180 degrees, with East=0 & North=90).
 
 I need to set each x,y point to have the alignment in z. Hence my
 'vectors'
 would simply be lines with the mid-point at x,y and without
 arrow-heads.
 
 R's normal vector plot requires a pair of x,y coords for the start &
 end of
 each vector, whereas I just have an orientation.
 
 Any ideas?
 
 Cheers!
 Tom
 



Re: [R] How to save a model in DB and retrieve It

2010-04-02 Thread Greg Snow
Look at the serialize function, it may accomplish what you want.
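
A minimal round trip with serialize() and unserialize() might look like
the sketch below. It uses a small lm fit on the built-in cars data rather
than the loess model from the question, and it leaves out the database
step entirely: how the resulting string is stored, and any size limit of
the column or driver, are assumptions that would need checking for the
database in use.

```r
# A small model to stand in for the one to be stored
fit <- lm(dist ~ speed, data = cars)

# Serialize to a raw vector in ASCII format, then to a single string
# that could be written to a text column
raw_bytes <- serialize(fit, connection = NULL, ascii = TRUE)
txt <- rawToChar(raw_bytes)

# Later: turn the stored string back into a raw vector and restore the model
fit2 <- unserialize(charToRaw(txt))

# The restored model should predict exactly like the original
new_data <- data.frame(speed = c(10, 20))
all.equal(predict(fit, new_data), predict(fit2, new_data))
```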

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
 project.org] On Behalf Of Daniele Amberti
 Sent: Friday, April 02, 2010 2:37 AM
 To: r-help@r-project.org; r-sig...@stat.math.ethz.ch
 Subject: [R] How to save a model in DB and retrieve It
 
 I'm wondering how to save an object (models like lm, loess, etc) in a
 DB to retrieve and use it afterwards, an example:
 
 wind_ms <- abs(rnorm(24*30)*4+8)
 air_kgm3 <- rnorm(24*30, 0.1)*0.1 + 1.1
 wind_dg <- rnorm(24*30) * 360/7
 ms <- c(0:25)
 kw_mm92 <- c(0,0,0,20,94,205,391,645,979,1375,1795,2000,2040)
 kw_mm92 <- c(kw_mm92, rep(2050, length(ms)-length(kw_mm92)))
 modelspline <- splinefun(ms, kw_mm92)
 kw <- abs(modelspline(wind_ms) - (wind_dg)*2 + (air_kgm3 - 1.15)*300 +
 rnorm(length(wind_ms))*10)
 #plot(wind_ms, kw)
 windDat <- data.frame(kw, wind_ms, air_kgm3, wind_dg)
 windDat[windDat$wind_ms < 3, 'kw'] <- 0
 model <- loess(kw ~ wind_ms + air_kgm3 + wind_dg, data = windDat,
 enp.target = 10*5*3) #, span = 0.1)
 
 modX <- serialize(model, connection = NULL, ascii = T)
 
 Channel <- odbcConnect("someSysDSN; UID=aUid; PWD=aPwd")
 sqlQuery(Channel,
 paste(
 "INSERT INTO GRT.GeneratorsModels
([cGeneratorID]
,[tModel]
VALUES
(1,",
paste("'", gsub("'", "''", rawToChar(modX)), "'", sep = ''),
")", sep = "") )
 # Up to this point it is working correctly,
 # in DB I have the modX variable
 # Problems arise when retrieving the data, with a 64kb limit:
   strQ <- "
 SELECT  CONVERT(varchar(max), tModel) AS tModel
 FROM    GRT.GeneratorsModels
 WHERE   (cGeneratorID = 1)
 "
 x <- sqlQuery(Channel, strQ, stringsAsFactors = F, believeNRows =
 FALSE)
 x <- sqlQuery(Channel, strQ, stringsAsFactors = F, believeNRows =
 FALSE) #read error
 
 
 
 Above code is working for simplier models that have a shorter
 representation in variable modX.
 Any advice on how to store and retieve this kind of objects?
 Thanks
 Daniele
 
 
 ORS Srl
 
 Via Agostino Morando 1/3 12060 Roddi (Cn) - Italy
 Tel. +39 0173 620211
 Fax. +39 0173 620299 / +39 0173 433111
 Web Site www.ors.it
 
 ---
 Any unauthorized use of this message and its attachments is prohibited
 and may constitute a criminal offence.
 If you have received this message in error, we would be grateful
 if you would destroy it and any attachments.
 Opinions, conclusions or other information contained in this e-mail
 that do not relate to the activities and/or corporate mission of
 O.R.S. Srl are not attributable to the company, nor do they bind it
 in any way.


Re: [R] Merge failure using zoo package

2010-04-02 Thread Gabor Grothendieck
On Fri, Apr 2, 2010 at 1:01 PM, William Dunlap wdun...@tibco.com wrote:
 -Original Message-
 From: r-help-boun...@r-project.org
 [mailto:r-help-boun...@r-project.org] On Behalf Of e-letter
 Sent: Friday, April 02, 2010 9:20 AM
 To: Gabor Grothendieck
 Cc: r-help@r-project.org
 Subject: Re: [R] Merge failure using zoo package

 On 02/04/2010, Gabor Grothendieck ggrothendi...@gmail.com wrote:
  The code does not run with the files.  I need the requested
  information, namely a single file containing code and data
 and that I
  can just copy into a session without editing and see the result you
  see.

 I don't understand how I can combine the four csv files into a single
 file of data and terminal commands?

 One way is to use a call to textConnection() instead of a
 file name.

 E.g., if you show a file called data.txt containing the lines
 VarA VarB
    1    2
    3    4
 and you read that into R with
    data <- read.table(header=TRUE, "data.txt")
 then the R-helper needs to copy the file contents into
 an editor, save the file under the appropriate name,
 then copy the command into an R session.

 However, you can replace the file and the original read.table
 command with one command that an R-helper can paste into an
 R session:

    data <- read.table(header=TRUE, textConnection("
 VarA VarB
    1    2
    3    4
 "))

I personally rarely use this form since it makes it harder to
transition to the case where you have a file name.  I generally prefer
the textConnection form.


 Another approach is to use dput(data) to print the
 dataset and stick a 'data <-' on the front of what
 was printed.  E.g.,  with the above 'data' you can do
    > dput(data)
   structure(list(VarA = c(1L, 3L), VarB = c(2L, 4L)), .Names = c("VarA",
   "VarB"), class = "data.frame", row.names = c(NA, -2L))
 and send R-help the command
   data <-
   structure(list(VarA = c(1L, 3L), VarB = c(2L, 4L)), .Names = c("VarA",
   "VarB"), class = "data.frame", row.names = c(NA, -2L))

This form is convenient but in some cases it may leave one wondering
how the data came about.  As long as that is not in question then dput
is really nice.

 You may have to insert some line breaks in sensible positions
 so that Outlook or Exchange doesn't break lines in nonsensical
 positions.  Again, the R-helper can copy and paste that
 code into an R session and come up with a dataset identical
 to yours.

 (I've seen copy-n-pastable code in R-help that starts with
   remove(list=ls())
 The suggestion to remove all objects really puts off R-helpers.)

Yes, that is unacceptable code to post.  It can really cause horrible
problems for readers.  I personally never use code like this.


 Bill Dunlap
 Spotfire, TIBCO Software
 wdunlap tibco.com


Re: [R] Trouble loading package

2010-04-02 Thread Uwe Ligges



On 28.03.2010 21:20, Peter Ehlers wrote:

I haven't seen an answer to this yet.

Your problem may stem from having defined a variable T.
I can replicate your error messages with:

T <- "hello"
library(RMark)

So methinks that this probably indicates that there may be
a problem with using T for TRUE (when will Rusers finally
stop doing that???).

And sure enough, after loading RMark (with no T in my
workspace), I find that the authors of RMark have replaced
base R's .First.lib with their own version which contains
the line:

info <- strsplit(library(help = pkgname, character.only = T)$info[[1]],
"\\:[ ]+")

Note to RMark authors (and others): get used to using
TRUE and FALSE. The few characters saved by using T/F
are not worth it!
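
To illustrate the point about T and F, here is a minimal sketch (not taken from RMark; the variable names are made up for illustration). It shows that T is an ordinary variable that can be masked, whereas TRUE is a reserved word that cannot be reassigned.

```r
# T is only a variable bound to TRUE in base; a user-level T masks it.
T <- "hello"
masked <- tryCatch(!T, error = function(e) conditionMessage(e))
print(masked)     # message of the error raised by negating a character string

# TRUE is a reserved word, so assigning to it fails at evaluation time.
reserved <- tryCatch(eval(parse(text = "TRUE <- 'hello'")),
                     error = function(e) conditionMessage(e))
print(reserved)

# Remove the mask so that T refers to the base definition again.
rm(T)
stopifnot(isTRUE(T))
```

Any code that writes character.only = T inherits whatever T happens to be in the user's workspace, which is the failure reported in this thread.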



Let me add:
Please note that such code cannot pass R CMD check.  In other 
words, the authors either ignored the error or never checked their 
package. Such a package could not be shipped through CRAN, for example.

Please use the package checks. They are really useful.

Best,
Uwe Ligges



-Peter Ehlers

On 2010-03-26 15:40, Glenn E Stauffer wrote:

I am trying to load a package called Rmark, but when I run

library(Rmark)

I get the following:


library(RMark)

Error in !character.only : invalid argument type
Error in library(RMark) : .First.lib failed for 'RMark'

When I try to load Rmark from the packages menu, I get:


local({pkg <- select.list(sort(.packages(all.available = TRUE)))

+ if(nchar(pkg)) library(pkg, character.only=TRUE)})
Error in !character.only : invalid argument type
Error in library(pkg, character.only = TRUE) :
.First.lib failed for 'RMark'

Any ideas what is causing this error?

My OS is Windows XP, and my R version is R.2.10.1

Thanks,
Glenn

*
Glenn E. Stauffer
Graduate Research Assistant
Department of Ecology
Montana State University
Bozeman, MT 59717
406-994-5677
gestauf...@gmail.com



Re: [R] R package checking error.

2010-04-02 Thread Uwe Ligges
The error message "F used instead of FALSE" is pretty clear to me: 
use FALSE rather than F in your code.
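
A rough sketch of why the examples fail under the check: as far as I understand, the examples are run in an environment where T and F are rebound so that touching them raises an error. The code below only imitates that idea with delayedAssign(); it is not the actual check code, and the function names are hypothetical.

```r
# Two otherwise identical functions; only the first relies on F.
uses_F     <- function(x) sort(x, decreasing = F)
uses_FALSE <- function(x) sort(x, decreasing = FALSE)

# An environment in which F is a promise that stops as soon as it is evaluated.
check_env <- new.env()
delayedAssign("F", stop("F used instead of FALSE"), assign.env = check_env)
environment(uses_F)     <- check_env
environment(uses_FALSE) <- check_env

msg <- tryCatch(uses_F(c(3, 1, 2)), error = function(e) conditionMessage(e))
ok  <- uses_FALSE(c(3, 1, 2))
print(msg)
print(ok)
```

The version written with FALSE is unaffected, because FALSE is a reserved word and is never looked up as a variable.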


Uwe Ligges


On 30.03.2010 07:36, Dong H. Oh wrote:

Dear useRs,

I am trying to build my package (nonparaeff), which deals with some models of
data envelopment analysis.

The building worked well, but checking complains when it tests examples.

Zipped nonparaeff.Rcheck is attached.

Following is the log.
-
arecibo:tmp arecibo$ R CMD build nonparaeff/
* checking for file 'nonparaeff/DESCRIPTION' ... OK
* preparing 'nonparaeff':
* checking DESCRIPTION meta-information ... OK
* checking whether 'INDEX' is up-to-date ... NO
* use '--force' to overwrite the existing 'INDEX'
* removing junk files
* checking for LF line-endings in source and make files
* checking for empty or unneeded directories
* building 'nonparaeff_0.5-1.tar.gz'

arecibo:tmp arecibo$ R CMD check nonparaeff_0.5-1.tar.gz
* checking for working pdflatex ... OK
* using log directory '/Users/arecibo/tmp/nonparaeff.Rcheck'
* using R version 2.10.0 (2009-10-26)
* using session charset: UTF-8
* checking for file 'nonparaeff/DESCRIPTION' ... OK
* this is package 'nonparaeff' version '0.5-1'
* checking package dependencies ... OK
* checking if this is a source package ... OK
* checking for executable files ... OK
* checking whether package 'nonparaeff' can be installed ... WARNING
Found the following significant warnings:
   Warning: package 'geometry' was built under R version 2.10.1
See '/Users/arecibo/tmp/nonparaeff.Rcheck/00install.out' for details.
* checking package directory ... OK
* checking for portable file names ... OK
* checking for sufficient/correct file permissions ... OK
* checking DESCRIPTION meta-information ... OK
* checking top-level files ... OK
* checking index information ... OK
* checking package subdirectories ... OK
* checking R files for non-ASCII characters ... OK
* checking R files for syntax errors ... OK
* checking whether the package can be loaded ... OK
* checking whether the package can be loaded with stated dependencies ... OK
* checking for unstated dependencies in R code ... OK
* checking S3 generic/method consistency ... OK
* checking replacement functions ... OK
* checking foreign function calls ... OK
* checking R code for possible problems ... NOTE
Found possibly global 'T' or 'F' in the following function:
   ar.dual.dea
* checking Rd files ... NOTE
prepare_Rd: ar.dual.dea.Rd:51: Dropping empty section \seealso
* checking Rd metadata ... OK
* checking Rd cross-references ... OK
* checking for missing documentation entries ... OK
* checking for code/documentation mismatches ... OK
* checking Rd \usage sections ... OK
* checking examples ... ERROR
Running examples in 'nonparaeff-Ex.R' failed.
The error most likely occurred in:


### * ar.dual.dea

flush(stderr()); flush(stdout())

### Name: ar.dual.dea
### Title: Assurance Region Data Envelopment Aanlysis (AR-DEA)
### Aliases: ar.dual.dea
### Keywords: Data Envelopment Analysis

### ** Examples


## AR constraint of 0.25 <= v2/v1 <= 1.
> library(Hmisc)
> library(lpSolve)
> ar.dat <- data.frame(y = c(1, 1, 1, 1, 1, 1),
+  x1 = c(2, 3, 6, 3, 6, 6),
+  x2 = c(5, 3, 1, 8, 4, 2))
> (re <-
+ ar.dual.dea(ar.dat, noutput = 1, orientation = 1, rts = 1, ar.l =
+ matrix(c(0, 0, 0.25, -1, -1, 1), nrow = 2, ncol = 3), ar.r = c(0, 0),
+ ar.dir = c(=, =)))
Error in ar.dual.dea(ar.dat, noutput = 1, orientation = 1, rts = 1, ar.l =
matrix(c(0,  :
   F used instead of FALSE
Execution halted
---

The following is the output of sessionInfo():

R> sessionInfo()
R version 2.10.0 (2009-10-26)
x86_64-apple-darwin9.8.0

locale:
[1] C/UTF-8/C/C/C/C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] lmtest_0.9-26 zoo_1.5-8 gdata_2.6.1   lpSolve_5.6.4 xtable_1.5-5
[6] MASS_7.3-3

loaded via a namespace (and not attached):
[1] grid_2.10.0 gtools_2.6.1lattice_0.17-26


Thank you for your time and consideration.

Best regards,
Dong-hyun Oh



Re: [R] Adding regression lines to each factor on a plot when using ANCOVA

2010-04-02 Thread RICHARD M. HEIBERGER
Michael and others,

Here is my complete ancova example

http://astro.ocis.temple.edu/~rmh/HH/hotdog.pdf


This example, especially in Figure 6, places them in a context of a
Cartesian
product of models with the intercept having two levels and slope having
three levels.

It is based on the ancova chapter from my book

Heiberger, Richard M., and Burt Holland (2004).
Statistical Analysis and Data Display: An Intermediate Course
with Examples in S-Plus, R, and SAS.
Springer-Verlag, New York.
http://springeronline.com/0-387-40270-5

and in essentially this form was used in my chapter in
Heiberger, Richard M., and Burt Holland (2008).
"Structured Sets of Graphs."  Chapter III.6 (pp. 415-445) in
Handbook of Computational Statistics on Data Visualization,
edited by Chun-houh Chen, Antony Unwin, and Wolfgang Härdle.
Springer-Verlag, Berlin.

Rich



[R] All sub-summands of a vector

2010-04-02 Thread Andy Rominger
Hello,

I'd like to take all possible sub-summands of a vector in the quickest and
most efficient way possible.  By sub-summands I mean for each sub-vector,
take its sum.  Which is to say: if I had the vector

x <- 1:4

I'd want the sum of x[1], x[2], etc.  And then the sum of x[1:2], x[2:3],
etc.  And then...so on.

The result would be:
1 2 3 4
3 5 7
6 9
10

I can do this with for loops (code below) but for long vectors (10^6
elements) looping takes more time than I'd like.  Any suggestions?

Thanks very much in advance--
Andy


# calculate sums of all sub-vectors...
x <- 1:4

sub.vect <- vector("list", 4)

for(t in 1:4) {
    maxi <- 4 - t + 1
    this.sub <- numeric(maxi)
    for(i in 1:maxi) {
        this.sub[i] <- sum(x[i:(i+t-1)])
    }
    sub.vect[[t]] <- this.sub
}

sub.vect
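
One possible way to avoid the inner loop, sketched here in base R: the sum of a window is the difference of two cumulative sums. The function name sub_sums is made up for this illustration.

```r
# Sum of x[i:(i+k-1)] equals cs[i+k] - cs[i], where cs = c(0, cumsum(x)).
sub_sums <- function(x) {
  n  <- length(x)
  cs <- c(0, cumsum(as.numeric(x)))   # as.numeric() avoids integer overflow
  lapply(seq_len(n), function(k) cs[(k + 1):(n + 1)] - cs[1:(n - k + 1)])
}

res <- sub_sums(1:4)
print(res)
```

Each window length then costs a single vectorised subtraction. Note that the full result holds n(n+1)/2 numbers, so for a vector of 10^6 elements it may be more practical to compute only the window lengths that are actually needed.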



[R] Selecting the first row based on a factor

2010-04-02 Thread Sam Albers
Hello there,

I have a situation where I would like to select the first row of a
particular factor for a data frame (data example below). So that is, I would
like to select the first entry when the factor1 =A and then the first row
when factor1=B etc. I have thousands of entries so I need some general way
of doing this. I have a minimal example that should illustrate what I am
trying to do. I am using R version 2.9.2, ESS version 5.4 and Ubuntu 9.04.

Thanks so much in advance!

Sam

#Minimal example

x <- rnorm(100)
y <- rnorm(100)
xy <- data.frame(x,y)
xy$factor1 <- c("A", "B","C","D")
xy$factor2 <- c("a","b")
xy <- xy[order(xy$factor1),]  #This simply orders the data to look more like the actual data I am working with

#I am trying to use this approach but I am not sure that I am selecting the
#correct row and then the output temp is a total mess.
temp <- with(xy, unlist(lapply(split(xy, list(factor1=factor1,
factor2=factor2)), function(x) x[1,])))

              x            y factor1 factor2
1   0.700042585 -2.481633101       A       a   # I would like to select this row
5   1.402677849 -0.691143942       A       a
9   0.188287765 -1.723823157       A       a
13  0.714946028  0.715361315       A       a
17  0.690177271 -0.112394002       A       a
21  0.333101579 -0.316285321       A       a
25  0.439505793 -3.356415326       A       a
89 -1.001153334 -0.739440288       A       a
93  0.135509539  0.949943380       A       a
97 -1.730936150  0.356133105       A       a
2  -0.399355582 -0.843874548       B       b   # Then I would like to select this row, etc.
6   1.285958969  0.958501988       B       b
10  0.495795836 -0.805012667       B       b
14  0.512486789 -0.968247016       B       b
18 -1.189627025  0.455278250       B       b

-- 
*
Sam Albers
Geography Program
University of Northern British Columbia
 University Way
Prince George, British Columbia
Canada, V2N 4Z9
phone: 250 960-6777
*



Re: [R] All sub-summands of a vector

2010-04-02 Thread Jorge Ivan Velez
Hi Andy,

Take a look at the rollapply function in the zoo package.

> require(zoo)
Loading required package: zoo
> x <- 1:4
> rollapply(zoo(x), 1, sum)
1 2 3 4
1 2 3 4
> rollapply(zoo(x), 2, sum)
1 2 3
3 5 7
> rollapply(zoo(x), 3, sum)
2 3
6 9
> rollapply(zoo(x), 4, sum)
 2
10

# all at once
sapply(1:4, function(r) rollapply(zoo(x), r, sum))


HTH,
Jorge




Re: [R] Selecting the first row based on a factor

2010-04-02 Thread Erik Iverson

Hello,


Does

xy[!duplicated(xy$factor1),]

do what you want?
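
A small self-contained sketch of how the duplicated() idiom behaves, using a made-up data frame rather than the poster's data. duplicated() flags every repeat of a value, so negating it keeps the first row of each level in the current row order.

```r
# Toy data in the same shape as the example: two cycling grouping columns.
xy <- data.frame(x = 1:8,
                 factor1 = rep(c("A", "B", "C", "D"), times = 2),
                 factor2 = rep(c("a", "b"), times = 4))
xy <- xy[order(xy$factor1), ]

# First row for each level of factor1
first_by_f1 <- xy[!duplicated(xy$factor1), ]

# First row for each combination of factor1 and factor2
first_by_both <- xy[!duplicated(xy[c("factor1", "factor2")]), ]

print(first_by_f1)
print(first_by_both)
```

Because the result depends on row order, the data should be sorted the way "first" is meant before the idiom is applied.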



Re: [R] All sub-summands of a vector

2010-04-02 Thread Gabor Grothendieck
There is also rollmean in the zoo package, which might be slightly
faster since it's optimized for that operation.
k * rollmean(x, k)
e.g.

> 2 * rollmean(1:4, 2)
[1] 3 5 7

will give a rolling sum. runmean in the caTools package is even faster.



[R] compare two fingerprint images

2010-04-02 Thread Juan Antonio Gil Pascual

Hello
I want to compare two fingerprint images. How can this be done in R?
Is there a function for cross-correlation of images?

Thanks

--
=
Juan Antonio Gil Pascual
Associate Professor of Research Methods in Education
email: j...@edu.uned.es
web: www.uned.es/personal/jgil

U.N.E.D.
Fac. de Educación
Dpto. MIDE I
Pº Senda del Rey,7 desp. 122
28040 MADRID
Tel. 91 398 72 79
Fax. 91 398 72 88




Re: [R] Selecting the first row based on a factor

2010-04-02 Thread Sam Albers
Thanks!

On Fri, Apr 2, 2010 at 11:35 AM, Erik Iverson er...@ccbr.umn.edu wrote:

 Hello,



 Does

 xy[!duplicated(xy$factor1),]


This most definitely works. What a beautifully elegant solution. Thanks!


 do what you want?




-- 
*
Sam Albers
Geography Program
University of Northern British Columbia
 University Way
Prince George, British Columbia
Canada, V2N 4Z9
phone: 250 960-6777
*



Re: [R] lineplot.CI in sciplot: option ci.fun can't be changed?

2010-04-02 Thread Manuel Morales
For now, just change fun(x) to median(x) (or whatever) in your ci.fun()
below. 

E.g.
lineplot.CI(x.factor = dose, response = len, data = ToothGrowth, ci.fun=
function(x) c(mean(x)-2*se(x), mean(x)+2*se(x)))

Otherwise, maybe the list members could help with a solution. An example
that illustrates the problem:

ex.fn <- function(x, 
 fun = mean,
 fun2 = function(x) fun(x)+sd(x)) {
 
 list(fun=fun(x), fun2=fun2(x))
}

data <- rnorm(10)

ex.fn(data)  #works
ex.fn(data, fun=median)  #works
ex.fn(data, fun2=function(x) fun(x)+3)   #error with fun(x) not found
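
A sketch of what is going on in this example and one possible workaround (make_fun2 is a hypothetical helper, not part of sciplot). Default argument expressions are evaluated inside the function's own frame, so the default fun2 can see fun. A function written by the caller is closed over the caller's environment instead, so it cannot see the fun argument of ex.fn. Building the function in a factory that receives fun explicitly sidesteps the lookup.

```r
ex.fn <- function(x, fun = mean, fun2 = function(x) fun(x) + sd(x)) {
  list(fun = fun(x), fun2 = fun2(x))
}

# Hypothetical helper: returns a function that carries its own copy of 'fun'.
make_fun2 <- function(fun, offset = 3) {
  force(fun)                      # evaluate the argument now, not lazily
  function(x) fun(x) + offset
}

vals <- c(1, 2, 3, 10)
res  <- ex.fn(vals, fun = median, fun2 = make_fun2(median))
print(res)
```

The cost is that the summary function has to be named twice, once for fun and once inside the factory call.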


-- 
http://mutualism.williams.edu



Re: [R] Sharing levels across multiple factor vectors

2010-04-02 Thread Jeff Brown

Ah, I finally figured it out:  I had asked

 In both of those cases, why is the []  needed? 

It's because when on the left hand side of an assignment, the bracket
operator attempts to preserve the class and dimension of the object it's
subsetting.  (Or at least, that's true when the object is a data frame.)
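
A minimal sketch of the behaviour described, with made-up data: assigning the result of lapply() to df[] keeps the data frame, while assigning it to df itself replaces the data frame with a plain list.

```r
df <- data.frame(a = c("x", "y"), b = c("y", "z"), stringsAsFactors = FALSE)

# Levels shared by every column
shared <- sort(unique(unlist(df)))

# Without []: the result of lapply() is an ordinary list
as_list <- lapply(df, factor, levels = shared)

# With []: the columns are replaced in place and df stays a data frame
df[] <- lapply(df, factor, levels = shared)

print(class(as_list))
print(class(df))
print(levels(df$a))
```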
-- 
View this message in context: 
http://n4.nabble.com/Sharing-levels-across-multiple-factor-vectors-tp1747714p1749502.html
Sent from the R help mailing list archive at Nabble.com.



[R] lineplot.CI in sciplot: option ci.fun can't be changed?

2010-04-02 Thread Tao Shi

hi List and Manuel,

I have encountered the following problem with the function lineplot.CI.  I'm 
running R 2.10.1, sciplot 1.0-7 on Win XP.  It seems like it's a scoping issue, 
but I couldn't figure it out.

Thanks!

...Tao



> lineplot.CI(x.factor = dose, response = len, data = ToothGrowth)    ## fine
> lineplot.CI(x.factor = dose, response = len, data = ToothGrowth, fun=median)  ## fine
> lineplot.CI(x.factor = dose, response = len, data = ToothGrowth, fun=mean)  ## fine
> lineplot.CI(x.factor = dose, response = len, data = ToothGrowth, ci.fun= 
  function(x) c(fun(x)-2*se(x), fun(x)+2*se(x)))  ## failed!
Error in FUN(X[[1L]], ...) : could not find function "fun"

> debug(lineplot.CI)
> lineplot.CI(x.factor = dose, response = len, data = ToothGrowth, ci.fun= 
  function(x) c(fun(x)-2*se(x), fun(x)+2*se(x)))

Browse[2]> 
debug: mn.data <- tapply(response, groups, fun)
Browse[2]> 
debug: CI.data <- tapply(response, groups, ci.fun)
Browse[2]> fun
function (x) 
mean(x, na.rm = TRUE)
<environment: 0x07178640>
Browse[2]> ci.fun
function(x) c(fun(x)-2*se(x), fun(x)+2*se(x))
Browse[2]> debug(ci.fun)
Browse[2]> fun
function (x) 
mean(x, na.rm = TRUE)
<environment: 0x07178640>
Browse[2]> 
debugging in: FUN(X[[1L]], ...)
debug: c(fun(x) - 2 * se(x), fun(x) + 2 * se(x))
Browse[3]> 
Error in FUN(X[[1L]], ...) : could not find function "fun"
> undebug(lineplot.CI)
> lineplot.CI(x.factor = dose, response = len, data = ToothGrowth, ci.fun= 
  function(x) c(fun(x)-se(x), fun(x)+se(x))) 
Error in FUN(X[[1L]], ...) : could not find function "fun"
> lineplot.CI(x.factor = dose, response = len, data = ToothGrowth, fun = 
  function(x) mean(x, na.rm=TRUE),ci.fun= function(x) c(fun(x)-se(x), 
  fun(x)+se(x))) 
Error in FUN(X[[1L]], ...) : could not find function "fun"
> lineplot.CI(x.factor = dose, response = len, data = ToothGrowth, fun = 
  function(x) median(x, na.rm=TRUE),ci.fun= function(x) c(fun(x)-se(x), 
  fun(x)+se(x))) 
Error in FUN(X[[1L]], ...) : could not find function "fun"





  


[R] panel data

2010-04-02 Thread Geoffrey Smith
Hello, I have an unbalanced panel data set that looks like:

ID,YEAR,HEIGHT
Tom,2007,65
Tom,2008,66
Mary,2007,45
Mary,2008,50
Harry,2007,62
Harry,2008,62
James,2007,68
Jack,2007,70
Jordan,2008,72

That is, James, Jack, and Jordan are missing a YEAR.

Is there any command that will fill in the missing YEAR such that the end
result will be balanced and look like:

ID,YEAR,HEIGHT
Tom,2007,65
Tom,2008,66
Mary,2007,45
Mary,2008,50
Harry,2007,62
Harry,2008,62
James,2007,68
James,2008,NA
Jack,2007,70
Jack,2008,NA
Jordan,2007,NA
Jordan,2008,72

Thank you.  Geoff
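
One way this could be done in base R, sketched with the data from the question: build every ID/YEAR combination with expand.grid() and left-join the observed rows onto it with merge(). The object names (panel, grid, balanced) are chosen here for illustration only.

```r
panel <- read.csv(text = "ID,YEAR,HEIGHT
Tom,2007,65
Tom,2008,66
Mary,2007,45
Mary,2008,50
Harry,2007,62
Harry,2008,62
James,2007,68
Jack,2007,70
Jordan,2008,72", stringsAsFactors = FALSE)

# Every combination of ID and YEAR that a balanced panel would contain
grid <- expand.grid(ID = unique(panel$ID),
                    YEAR = sort(unique(panel$YEAR)),
                    stringsAsFactors = FALSE)

# Left join: combinations without an observation get HEIGHT = NA
balanced <- merge(grid, panel, by = c("ID", "YEAR"), all.x = TRUE)

# Restore the original ordering of the IDs, then sort by year within ID
balanced <- balanced[order(match(balanced$ID, unique(panel$ID)),
                           balanced$YEAR), ]
rownames(balanced) <- NULL
print(balanced)
```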

-- 
Geoffrey Smith
Visiting Assistant Professor
Department of Finance
W. P. Carey School of Business
Arizona State University



Re: [R] All sub-summands of a vector

2010-04-02 Thread Andy Rominger
Great, thanks for your help.  I tried:
x <- 1:1
y <- lapply(1:1,function(t){t*runmean(x,t,alg="fast",endrule="trim")})

and it worked in about 90 sec.

Thanks again,
Andy




[R] mouse-clicking on xy-plot

2010-04-02 Thread Nuno Prista
Hi,

I seem to recall coming across a function that allowed one to mouse-click on an 
xy-plot and obtain x and y coordinates. Can anyone remind me its name?

Thanks,

Nuno



[R] Simple plot of values and error bars: Is there an existing function for this

2010-04-02 Thread John Kane
In an OpenOffice.org forum someone asked if it was possible to plot some raw 
data and then add a line for the confidence interval. 

Example at  http://www.graphpad.com/help/Prism5/scatter%20-%20grouped.png

While it may be possible to do this in OOo's spreadsheet program it looks nasty 
(both to do and the results )

I can do this in R but I'm not  good enough that I can produce a fairly 
seamless function to handle it. This must be a fairly common plot so I was 
wondering if anyone can point me to an existing function for it?

I suspect if I knew a bit more about ggplot2 I could do it there.

Thanks




Re: [R] mouse-clicking on xy-plot

2010-04-02 Thread Erik Iverson

Are you thinking of ?identify ?



[R] my mail to post

2010-04-02 Thread nicolas nicolas
nmola...@gmail.com

-- 
Att: Nicolás Molano



Re: [R] mouse-clicking on xy-plot

2010-04-02 Thread Walmes Zeviani

You can use identify() to obtain coordinates from plotted points but if you
want any coordinates you could use locator():

> plot(1:10)
> loc <- locator(n=3)
> str(loc)
List of 2
 $ x: num [1:3] 2.3 5.4 8.29
 $ y: num [1:3] 6.15 8.33 2.6
> points(loc$x, loc$y, col=2)
 

Walmes.

-
..ooo0
...
..()... 0ooo...  Walmes Zeviani
...\..(.(.)... Master in Statistics and Agricultural
Experimentation
\_). )../   walmeszevi...@hotmail.com, Lavras - MG, Brasil

(_/
-- 
View this message in context: 
http://n4.nabble.com/mouse-clicking-on-xy-plot-tp1749562p1749586.html
Sent from the R help mailing list archive at Nabble.com.



[R] optim doesnt work with my function

2010-04-02 Thread nmolanogunal

# Hello, I have created this function, but optim doesn't maximize it; it just
# returns the value at the initial values.

W <- function(l){
  w <- rep(0, dim(D)[1])
  for(i in 1:dim(D)[1]){
    w[i] <- PAitk(D[i,], D[-i,], l)
  }
  return(prod(w))
}
# D is a matrix with entries in {0,1}; l is a vector with length(l) ==
# dim(D)[2].
# PAitk is another function, defined as

PAitk <- function(y, D, lambda){
  o <- rep(0, dim(D)[1])
  for(i in 1:dim(D)[1]){
    o[i] <- Aitk(lambda, y, D[i,])
  }
  return(sum(o)/dim(D)[1])
}
# with the same restriction on l, and

Aitk <- function(l, x, y){
  prod((l^(1-abs(x-y)))*((1-l)^abs(x-y)))
}
# with the same restriction on l.
# I want to maximize W in this way:
optim(rep(.75,5), W, method="L-BFGS-B", lower=rep(0.50001,5),
      upper=rep(0.,5), control=list(fnscale=-1))

# But as I said before, it just returns W's value at the initial values
# rep(.75,5), or whatever you supply.
# I am grateful for any help that you can offer.
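
One possible cause, offered as a guess rather than a diagnosis: W is a product of many probabilities, so its value can be extremely small, and the relative-improvement stopping test in L-BFGS-B may then treat every step as no improvement and stop at the starting point. Maximizing the sum of logs keeps the objective on a usable scale. The sketch below rests on assumptions: D is a made-up 40 x 5 binary matrix, the upper bound 0.99999 is invented (the bound in the post is truncated), and logW is a hypothetical name.

```r
set.seed(1)
# Hypothetical data: 40 binary observations on 5 variables
D <- matrix(rbinom(40 * 5, 1, 0.3), nrow = 40)

# Kernel from the post: product over variables
Aitk <- function(l, x, y) prod(l^(1 - abs(x - y)) * (1 - l)^abs(x - y))

# Average kernel value of y against every row of D
PAitk <- function(y, D, lambda) {
  mean(apply(D, 1, function(row) Aitk(lambda, y, row)))
}

# Leave-one-out objective on the log scale: sum of logs instead of a product
logW <- function(l) {
  w <- vapply(seq_len(nrow(D)),
              function(i) PAitk(D[i, ], D[-i, , drop = FALSE], l),
              numeric(1))
  sum(log(w))
}

start <- rep(0.75, 5)
fit <- optim(start, logW, method = "L-BFGS-B",
             lower = rep(0.50001, 5),
             upper = rep(0.99999, 5),   # assumed bound, not from the post
             control = list(fnscale = -1))
fit$par
fit$value
```

Because log is monotone, the maximizer of logW is also the maximizer of W; exp(fit$value) recovers the value of W if it is needed.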
-- 
View this message in context: 
http://n4.nabble.com/optim-doesnt-work-with-my-function-tp1749591p1749591.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] panel data

2010-04-02 Thread David Winsemius


On Apr 2, 2010, at 3:39 PM, Geoffrey Smith wrote:


Hello, I have an unbalanced panel data set that looks like:

ID,YEAR,HEIGHT
Tom,2007,65
Tom,2008,66
Mary,2007,45
Mary,2008,50
Harry,2007,62
Harry,2008,62
James,2007,68
Jack,2007,70
Jordan,2008,72

That is, James, Jack, and Jordan are missing a YEAR.

Is there any command that will fill in the missing YEAR such that the end
result will be balanced and look like:

ID,YEAR,HEIGHT
Tom,2007,65
Tom,2008,66
Mary,2007,45
Mary,2008,50
Harry,2007,62
Harry,2008,62
James,2007,68
James,2008,NA
Jack,2007,70
Jack,2008,NA
Jordan,2007,NA
Jordan,2008,72


It's not one command but it's an approach ...  assumes you have data  
in a dataframe named ftbl:


 fexp <- expand.grid(ID=unique(ftbl$ID), YEAR=unique(ftbl$YEAR))
 merge(fexp, ftbl, all=TRUE)

       ID YEAR HEIGHT
1   Harry 2007     62
2   Harry 2008     62
3    Jack 2007     70
4    Jack 2008     NA
5   James 2007     68
6   James 2008     NA
7  Jordan 2007     NA
8  Jordan 2008     72
9    Mary 2007     45
10   Mary 2008     50
11    Tom 2007     65
12    Tom 2008     66
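
For anyone who wants to run the approach end to end, here is a self-contained sketch. It rebuilds the example data with read.csv(text = ...), which is an assumption about how ftbl was created; the name balanced is only illustrative.

```r
# Rebuild the unbalanced panel from the question
ftbl <- read.csv(text = "ID,YEAR,HEIGHT
Tom,2007,65
Tom,2008,66
Mary,2007,45
Mary,2008,50
Harry,2007,62
Harry,2008,62
James,2007,68
Jack,2007,70
Jordan,2008,72", stringsAsFactors = FALSE)

# Every ID crossed with every YEAR
fexp <- expand.grid(ID = unique(ftbl$ID), YEAR = unique(ftbl$YEAR),
                    stringsAsFactors = FALSE)

# Merge on the shared columns (ID, YEAR); unmatched cells become NA
balanced <- merge(fexp, ftbl, all = TRUE)
balanced
```

The merge sorts by the key columns, so the rows come back ordered by ID rather than in the original order.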



 --

David Winsemius, MD
West Hartford, CT



[R] model reparameterization

2010-04-02 Thread casperyc

==
y=c(100,200,300,400,500)
treatment=c(1,2,3,3,4)
block=c(1,1,2,3,3)

summary(lm(y~as.factor(treatment)+as.factor(block)))
==

The aim is to find a model that can estimate
the comparison between treatment 1 with 2
and treatment 3 with 4.

I have tried all the possible ones
===
relevel(as.factor(block),ref=1);relevel(as.factor(treatment),ref=1)
relevel(as.factor(block),ref=1);relevel(as.factor(treatment),ref=2)
relevel(as.factor(block),ref=1);relevel(as.factor(treatment),ref=3)
relevel(as.factor(block),ref=1);relevel(as.factor(treatment),ref=4)

relevel(as.factor(block),ref=2);relevel(as.factor(treatment),ref=1)
relevel(as.factor(block),ref=2);relevel(as.factor(treatment),ref=2)
relevel(as.factor(block),ref=2);relevel(as.factor(treatment),ref=3)
relevel(as.factor(block),ref=2);relevel(as.factor(treatment),ref=4)

relevel(as.factor(block),ref=3);relevel(as.factor(treatment),ref=1)
relevel(as.factor(block),ref=3);relevel(as.factor(treatment),ref=2)
relevel(as.factor(block),ref=3);relevel(as.factor(treatment),ref=3)
relevel(as.factor(block),ref=3);relevel(as.factor(treatment),ref=4)


each followed by a line of

summary(lm(y~as.factor(treatment)+as.factor(block)))

I seem to always get NaNs

Am I doing something wrong there?

What model should I use then?

Thanks!

casper
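
A possible reading of the NaNs, sketched with the data from the post: the model has six coefficients but only five observations, so no residual degrees of freedom are left and no standard errors can be computed, whichever reference level is chosen. Note also that relevel() returns a new factor; calling it without assigning the result changes nothing. The variable names below are the poster's; the interpretation is an assumption about what went wrong.

```r
y <- c(100, 200, 300, 400, 500)
treatment <- factor(c(1, 2, 3, 3, 4))
block <- factor(c(1, 1, 2, 3, 3))

fit <- lm(y ~ treatment + block)
X <- model.matrix(fit)

ncol(X)            # number of coefficients requested
qr(X)$rank         # number that can actually be estimated
df.residual(fit)   # residual degrees of freedom
coef(fit)          # one coefficient is NA (aliased)

# The two comparisons of interest are still estimable as point estimates
coef(fit)["treatment2"]                            # treatment 2 vs 1
coef(fit)["treatment4"] - coef(fit)["treatment3"]  # treatment 4 vs 3
```

With a saturated fit like this, the point estimates exist but any test or confidence interval needs more observations or a smaller model.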

-- 
View this message in context: 
http://n4.nabble.com/model-reparameterization-tp1749621p1749621.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] [R-sig-DB] How to save a model in DB and retrieve It

2010-04-02 Thread Jeff Ryan
A very simple option, since you're only looking to efficiently store
and retrieve, is something like a key-value store.

There is a new rredis (redis) package on CRAN, as well as the
RBerkeley (Oracle Berkeley DB) package.

RBerkeley is as simple as db_put() and db_get() calls where you
specify a key and serialize/unserialize the object before and after.

Caveat to RBerkeley is that it is only functional on *nix until
someone contributes a Windows version or insight on what I need to do
to make that work (issue is that Berkeley DB can't be compiled easily
using the R version of mingw to compile).  The package code is likely
to work for windows if you can manage to get the db headers/libs
installed with the R toolchain.
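
The serialize/unserialize round trip can be sketched in base R. This does not use rredis or RBerkeley: an environment stands in for the key-value store, and put_model/get_model are hypothetical helper names. The model is a small lm on the built-in cars data, chosen only for illustration.

```r
# A plain environment plays the role of the key-value store (assumption)
store <- new.env()

# Serialize the object to a raw vector and file it under a key
put_model <- function(key, obj) {
  assign(key, serialize(obj, connection = NULL), envir = store)
}

# Fetch the raw vector and turn it back into an R object
get_model <- function(key) {
  unserialize(get(key, envir = store))
}

model <- lm(dist ~ speed, data = cars)
put_model("model:1", model)

restored <- get_model("model:1")
new_obs <- data.frame(speed = c(10, 20))
predict(model, new_obs)
predict(restored, new_obs)
```

Storing the raw vector as binary, instead of converting it to text, avoids quoting problems and character-length limits in the storage layer.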

HTH
Jeff

On Fri, Apr 2, 2010 at 3:37 AM, Daniele Amberti daniele.ambe...@ors.it wrote:
 I'm wondering how to save an object (models like lm, loess, etc) in a DB to 
 retrieve and use it afterwards, an example:

 wind_ms <- abs(rnorm(24*30)*4+8)
 air_kgm3 <- rnorm(24*30, 0.1)*0.1 + 1.1
 wind_dg <- rnorm(24*30) * 360/7
 ms <- c(0:25)
 kw_mm92 <- c(0,0,0,20,94,205,391,645,979,1375,1795,2000,2040)
 kw_mm92 <- c(kw_mm92, rep(2050, length(ms)-length(kw_mm92)))
 modelspline <- splinefun(ms, kw_mm92)
 kw <- abs(modelspline(wind_ms) - (wind_dg)*2 + (air_kgm3 - 1.15)*300 + 
 rnorm(length(wind_ms))*10)
 #plot(wind_ms, kw)
 windDat <- data.frame(kw, wind_ms, air_kgm3, wind_dg)
 windDat[windDat$wind_ms < 3, 'kw'] <- 0
 model <- loess(kw ~ wind_ms + air_kgm3 + wind_dg, data = windDat, enp.target 
 = 10*5*3) #, span = 0.1)

 modX <- serialize(model, connection = NULL, ascii = T)

 Channel <- odbcConnect("someSysDSN; UID=aUid; PWD=aPwd")
 sqlQuery(Channel,
 paste(
 "INSERT INTO GRT.GeneratorsModels
            ([cGeneratorID]
            ,[tModel])
    VALUES
            (1,",
            paste("'", gsub("'", "''", rawToChar(modX)), "'", sep = ''),
            ")", sep = "") )
 # Up to this point it is working correctly;
 # in the DB I have the modX variable.
 # The problem arises when retrieving the data, with a 64kb limit:
  strQ <- "
     SELECT  CONVERT(varchar(max), tModel) AS tModel
     FROM    GRT.GeneratorsModels
     WHERE   (cGeneratorID = 1)
     "
 x <- sqlQuery(Channel, strQ, stringsAsFactors = F, believeNRows = FALSE)
 x <- sqlQuery(Channel, strQ, stringsAsFactors = F, believeNRows = FALSE) 
 #read error



 The above code works for simpler models that have a shorter representation 
 in variable modX.
 Any advice on how to store and retrieve this kind of object?
 Thanks
 Daniele


 ORS Srl

 Via Agostino Morando 1/3 12060 Roddi (Cn) - Italy
 Tel. +39 0173 620211
 Fax. +39 0173 620299 / +39 0173 433111
 Web Site www.ors.it

 ___
 R-sig-DB mailing list -- R Special Interest Group
 r-sig...@stat.math.ethz.ch
 https://stat.ethz.ch/mailman/listinfo/r-sig-db




-- 
Jeffrey Ryan
jeffrey.r...@insightalgo.com

ia: insight algorithmics
www.insightalgo.com



Re: [R] Adding regression lines to each factor on a plot when using ANCOVA

2010-04-02 Thread Peter Ehlers


On 2010-04-02 11:07, Michael Friendly wrote:

This is a nice example; thanks for providing it in this form.  I tried
to trim it down to show fewer groups, but ran into the following errors
that I can't understand:

## keep species 1:6
  dataset <- subset(dataset, species < 7)
Warning message:
In Ops.factor(species, 7) : < not meaningful for factors


You could use: as.numeric(as.character(species)) < 7
(I usually keep both a numeric and a factor version of the
variable when I foresee doing something like this. Then
you can just use the numeric version with '<'.)



## OK, just subset the rows of dataset to keep species 1:6

  dataset <- dataset[1:20,]
  ancova(logBeak ~ logMass * species, data=dataset)
Error in `contrasts<-`(`*tmp*`, value = contr.treatment) :
contrasts can be applied only to factors with 2 or more levels


I don't get this error (in R 2.11.0 devel), but the lattice
display doesn't work. I get packet errors which go away after
I make sure that the new 'species' has only 6 levels:

 dataset$species <- factor(dataset$species)

I suspect that you may have to do the same.

 -Peter
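
Both points in this reply can be illustrated with a toy factor: comparing a factor with '<' needs a numeric copy of the labels, and subsetting keeps the unused levels until the factor is rebuilt. The data below are invented for illustration and are not the beak dataset.

```r
# Toy factor with numeric-looking labels (hypothetical data)
species <- factor(c(1, 1, 2, 2, 7, 7, 12, 12))

# as.numeric() on a factor gives the internal codes, not the labels
codes  <- as.numeric(species)                # 1 1 2 2 3 3 4 4
values <- as.numeric(as.character(species))  # 1 1 2 2 7 7 12 12

# Subset on the numeric labels
kept <- species[values < 7]
nlevels(kept)   # still 4: the unused levels "7" and "12" remain

# Rebuilding the factor drops the unused levels
kept <- factor(kept)
nlevels(kept)   # 2
```

Leftover empty levels are a common source of errors in model-fitting and lattice calls, which is why rebuilding the factor after subsetting tends to help.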


  ancova(logBeak ~ logMass + species, data=dataset)
Error in `contrasts<-`(`*tmp*`, value = contr.treatment) :
contrasts can be applied only to factors with 2 or more levels

-Michael


RICHARD M. HEIBERGER wrote:

## Steve,

## please use the ancova function in the HH package.

install.packages("HH")
library(HH)


## windows.options(record=TRUE)
windows.options(record=TRUE)
# hypothetical data
beak.lgth <-
c(2.3,4.2,2.7,3.4,4.2,4.8,1.9,2.2,1.7,2.5,15,16.5,14.7,9.6,8.5,9.1,
9.4,17.7,15.6,14,6.8,8.5,9.4,10.5,10.9,11.2,11.5,19,17.2,18.9,
19.5,19.9,12.6,12.1,12.9,14.1,12.5,15,14.8,4.3,5.7,2.4,3.5,2.9)
mass <-
c(45.9,47.1,47.6,17.2,17.9,17.7,44.9,44.8,45.3,44.9,39,39.7,41.2,
84.8,79.2,78.3,82.8,102.8,107.2,104.1,51.7,45.5,50.6,27.5,26.6,
27.5,26.9,25.4,23.7,21.7,22.2,23.8,46.9,51.5,49.4,33.4,33.1,33.2,
34.7,39.3,41.7,40.5,42.7,41.8)
## Make species into a factor
species <-
factor(c(1,1,1,2,2,2,3,3,3,3,4,4,4,5,5,5,5,6,6,6,7,7,7,
8,8,8,8,9,9,9,9,9,10,10,10,11,11,11,11,12,12,12,12,12))
## then construct a data.frame with the three variables and the log
## transforms
dataset <- data.frame(species, beak.lgth, mass,
logBeak=log10(beak.lgth),
logMass=log10(mass))
## default is 7 colors, we need 12
trellis.par.set("superpose.line",
Rows(trellis.par.get("superpose.line"), c(1:6, 1:6)))
trellis.par.set("superpose.symbol",
Rows(trellis.par.get("superpose.symbol"), c(1:6, 1:6)))

ancova(logBeak ~ logMass * species, data=dataset)
ancova(logBeak ~ logMass + species, data=dataset)
ancova(logBeak ~ logMass, groups=species, data=dataset)
ancova(logBeak ~ species, x=logMass, data=dataset)
bwplot(logBeak ~ species, data=dataset)

## Rich

[[alternative HTML version deleted]]






--
Peter Ehlers
University of Calgary



[R] Extracting 'SOME' values from a linear model.

2010-04-02 Thread HouseBandit

I have a regression model that I have fitted to some data, and to compare its
accuracy with some other models I have been extracting the same 10%
of the real data to perform a sum of squares calculation.

This is what I have tried, but it gives me only the '0.9*length(t)' fitted value
and the 'length(t)' fitted value; it doesn't give me those in
between.

my.lm
my.lm$fitted[c(0.9*length(t), length(t))]


Help please

Thanks
-- 
View this message in context: 
http://n4.nabble.com/Extracting-SOME-values-from-a-linear-model-tp1749643p1749643.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Using GIS data in R

2010-04-02 Thread Rolf Turner

On 2/04/2010, at 4:37 AM, Scott Duke-Sylvester wrote:

 I have a simple problem: I need to load an ESRI shapefile of US states
 and check whether or not a set of points are within the boundary of
 these states. I have the shapefile, I have the coordinates but I'm
 having a great deal of difficulty bringing the two together. The
 problem is the various GIS packages for R do not play well with each
 other. sp, shapefiles, maptools, etc all use different data
 structures. Can someone suggest a simple set of commands that will
 work together that will:
 
 1) load the shapefile data.
 2) Allow me to test whether or not a (lng,lat) coordinate pair are
 inside or outside the polygons defined in the shapefile.


You may get some mileage out of looking at Adrian Baddeley's vignette
``Handling shapefiles in the spatstat package'' (available at the
entry for spatstat under contributed extension packages on CRAN).

For item 2) you may find the inside.owin() function in spatstat useful.
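
To show what a point-in-polygon test does, here is a base-R ray-casting sketch. It is a stand-in for illustration, not the spatstat implementation behind inside.owin(); the function name point_in_polygon is hypothetical, and the sketch ignores holes, multi-part polygons and points lying exactly on the boundary.

```r
# Ray casting: count how many polygon edges a horizontal ray from the
# point crosses; an odd count means the point is inside.
# vx, vy are the polygon vertices in order (the polygon is closed implicitly).
point_in_polygon <- function(px, py, vx, vy) {
  n <- length(vx)
  inside <- FALSE
  j <- n
  for (i in seq_len(n)) {
    crosses <- (vy[i] > py) != (vy[j] > py)
    if (crosses) {
      # x coordinate where the edge meets the horizontal line y = py
      x_at_py <- (vx[j] - vx[i]) * (py - vy[i]) / (vy[j] - vy[i]) + vx[i]
      if (px < x_at_py) inside <- !inside
    }
    j <- i
  }
  inside
}

# Unit square as a toy "state" boundary
sq_x <- c(0, 1, 1, 0)
sq_y <- c(0, 0, 1, 1)
point_in_polygon(0.5, 0.5, sq_x, sq_y)
point_in_polygon(1.5, 0.5, sq_x, sq_y)
```

For real state boundaries, a tested package routine is the safer choice, since shapefile polygons often have several rings per state.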

cheers,

Rolf Turner



Re: [R] Extracting 'SOME' values from a linear model.

2010-04-02 Thread David Winsemius


On Apr 2, 2010, at 5:14 PM, HouseBandit wrote:



I have a regression model that I have used on some data and to look at its
accuracy compared to some other models I have been extracting the same 10%
of the real data to perform a sum of squares calculation

this is what I have tried but it gives me the '0.9*length(t)' fitted value
and the 'length(t)' fitted value that I want, but it doesnt give me those in
between.

my.lm
my.lm$fitted[]


That would index exactly 2 numbers. What was your goal? Try just this  
at your console:


c(0.9*length(t), length(t) )

You may want to look at:

?sample

--
David Winsemius, MD
West Hartford, CT



Re: [R] Extracting 'SOME' values from a linear model.

2010-04-02 Thread HouseBandit

My goal is to return the selected fitted values and then perform a sum of
squares calculation with them. I have looked at 'list' etc. but can't return
anything. It's either all of the fitted values or just the first and last of
the subset that I need.

Cheers
-- 
View this message in context: 
http://n4.nabble.com/Extracting-SOME-values-from-a-linear-model-tp1749643p1749659.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Cross-validation for parameter selection (glm/logit)

2010-04-02 Thread Steve Lianoglou
Hi,

On Fri, Apr 2, 2010 at 9:14 AM, Jay josip.2...@gmail.com wrote:
 If my aim is to select a good subset of parameters for my final logit
 model built using glm(). What is the best way to cross-validate the
 results so that they are reliable?

 Let's say that I have a large dataset of 1000's of observations. I
 split this data into two groups, one that I use for training and
 another for validation. First I use the training set to build a model,
 and the the stepAIC() with a Forward-Backward search. BUT, if I base
 my parameter selection purely on this result, I suppose it will be
 somewhat skewed due to the 1-time data split (I use only 1 training
 dataset)

Another approach would be to use penalized regression models.

The glmnet package has lasso and elasticnet models for both logistic
and normal regression models.

Intuitively: in addition to minimizing (say) the squared loss, the
model has to pay some cost (lambda) for including a non-zero parameter
in your model, which in turn provides sparse models.

You can use CV to fine tune the value for lambda.

If you're not familiar with these penalized models, the glmnet package
has a few references to get you started.
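
Jay's worry about relying on a single split can be addressed with k-fold cross-validation, sketched here in base R for an ordinary glm() logit. The data are simulated and cv_dev is a hypothetical helper; for the penalized models mentioned above, glmnet's cv.glmnet() does the equivalent job of choosing lambda.

```r
set.seed(42)
n <- 200
# Simulated data: x1 carries signal, x2 is pure noise (assumption)
x1 <- rnorm(n)
x2 <- rnorm(n)
y <- rbinom(n, 1, plogis(0.5 + 1.5 * x1))
dat <- data.frame(y, x1, x2)

k <- 5
fold <- sample(rep(1:k, length.out = n))   # random fold label per row

# Mean held-out binomial deviance of a candidate formula
cv_dev <- function(formula) {
  dev <- numeric(k)
  for (f in 1:k) {
    fit <- glm(formula, family = binomial, data = dat[fold != f, ])
    p <- predict(fit, newdata = dat[fold == f, ], type = "response")
    yy <- dat$y[fold == f]
    dev[f] <- -2 * sum(yy * log(p) + (1 - yy) * log(1 - p))
  }
  sum(dev) / n
}

cv_null <- cv_dev(y ~ 1)
cv_x1   <- cv_dev(y ~ x1)
cv_both <- cv_dev(y ~ x1 + x2)
c(null = cv_null, x1 = cv_x1, both = cv_both)
```

Lower held-out deviance indicates better out-of-sample fit. Repeating the procedure with different fold assignments gives a sense of how stable the ranking of candidate models is.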

-steve

-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact



[R] vector length help using prcomp

2010-04-02 Thread JRsalvelinus

Hi

I am doing PCA using prcomp and when I try to get predicted values for the
different PC's the number of data points is always one less than in my
original data set.  This is a problem because it prevents me from doing any
post-hoc analysis due to the fact that my dependent variables are one entry
longer than my PC's.  I have checked for missing data to see if it is
omitting any but it is not.  It seems like it is always omitting the first
data point because the output for the predicted PC values always starts at 2
not 1.  Other than that the results of the analysis make sense and it
appears to be working correctly.

If anyone has any idea why this may be happening I would appreciate some
help.

This is the script I am using.

chemPR1 <- prcomp(~ ANC + color + CA + pH + TP + volume + maxdepth +
meandepth + elevation 
+ surface + shoreline + littoral , center = TRUE, scale=TRUE, scores=TRUE,
cor=TRUE)

PC1 <- (predict(ALSC1)[,1])

Thanks.

Jason
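
One thing worth checking, as a guess only since the poster reports finding no missing data: with the formula interface, prcomp() drops any row that has an NA in one of the listed variables, so the scores come back shorter and their row names start at the first complete row. Passing na.action = na.exclude pads the scores back to the original length. The data frame below is invented; ANC, pH and TP are borrowed from the post only as column names. Note also that scores= and cor= are arguments of princomp(), not prcomp(), so they are left out here.

```r
# Hypothetical data with a missing value in the first row
d <- data.frame(ANC = c(NA, 2.1, 3.4, 5.0, 8.2, 13.1),
                pH  = c(6.1, 6.4, 5.9, 7.0, 6.8, 7.2),
                TP  = c(10, 14, 9, 20, 17, 25))

# Which rows are incomplete?
which(!complete.cases(d))

# Default handling: the incomplete row is dropped from the scores
pc_omit <- prcomp(~ ANC + pH + TP, data = d, center = TRUE, scale. = TRUE)
nrow(pc_omit$x)
rownames(pc_omit$x)

# na.exclude: scores padded with NA so they line up with the original rows
pc_pad <- prcomp(~ ANC + pH + TP, data = d, center = TRUE, scale. = TRUE,
                 na.action = na.exclude)
nrow(pc_pad$x)
PC1 <- pc_pad$x[, 1]
```

With the padded version, PC1 has the same length as the other variables in the data frame, so it can be used directly in later analyses.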
-- 
View this message in context: 
http://n4.nabble.com/vector-length-help-using-prcomp-tp1749669p1749669.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Cross-validation for parameter selection (glm/logit)

2010-04-02 Thread Bert Gunter
Inline below: 

Bert Gunter
Genentech Nonclinical Statistics

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Steve Lianoglou
Sent: Friday, April 02, 2010 2:34 PM
To: Jay
Cc: r-help@r-project.org
Subject: Re: [R] Cross-validation for parameter selection (glm/logit)

Hi,

On Fri, Apr 2, 2010 at 9:14 AM, Jay josip.2...@gmail.com wrote:
 If my aim is to select a good subset of parameters for my final logit
 model built using glm(). 

-- Define good


What is the best way to cross-validate the

-- Define best

 results so that they are reliable?

-- Define reliable

Answers depend on what you mean by these terms. I suggest you consult a
statistician to work with you. These are huge issues for which you would
profit by some guidance.

Cheers,
Bert


 Let's say that I have a large dataset of 1000's of observations. I
 split this data into two groups, one that I use for training and
 another for validation. First I use the training set to build a model,
 and the the stepAIC() with a Forward-Backward search. BUT, if I base
 my parameter selection purely on this result, I suppose it will be
 somewhat skewed due to the 1-time data split (I use only 1 training
 dataset)

Another approach would be to use penalized regression models.

The glmnet package has lasso and elasticnet models for both logistic
and normal regression models.

Intuitively: in addition to minimizing (say) the squared loss, the
model has to pay some cost (lambda) for including a non-zero parameter
in your model, which in turn provides sparse models.

You can use CV to fine tune the value for lambda.

If you're not familiar with these penalized models, the glmnet package
has a few references to get you started.

-steve

-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact



Re: [R] Extracting 'SOME' values from a linear model.

2010-04-02 Thread David Winsemius


On Apr 2, 2010, at 5:32 PM, HouseBandit wrote:



my goal is to return  the selected fitted values ...


Which were never really selected.


... and then perform a sum of
squares calculation with them. I have looked at 'list' etc. but can't return
anything. It's either all of the fitted values or just the first and last of
the subset that I need.


A) In the future, don't delete the email train.

B) try this code and see if you can get value out of it:

 vec <- 1:100
 vec[(length(vec)*0.9):length(vec)]
 [1]  90  91  92  93  94  95  96  97  98  99 100

Mind you this is just a guess at what you wanted because your original
posting seems unclear as to your goal, at least to my reading.
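
The difference between c(a, b) and a:b as an index, and the sum of squares on the selected block, can be put together in one sketch. It uses a small lm on the built-in cars data as a stand-in for my.lm, which is an assumption; idx and ss_last are illustrative names.

```r
fit <- lm(dist ~ speed, data = cars)
n <- nrow(cars)                      # 50 observations

# c(a, b) picks exactly two positions; a:b picks the whole range
two_only  <- c(0.9 * n, n)           # 45 and 50
last_part <- (floor(0.9 * n) + 1):n  # 46, 47, 48, 49, 50: the last 10%

idx <- last_part
fitted_last <- fitted(fit)[idx]

# Sum of squared differences between observed and fitted values
ss_last <- sum((cars$dist[idx] - fitted_last)^2)
ss_last
```

Starting the range at floor(0.9 * n) + 1 gives exactly the final tenth of the rows; starting at 0.9 * n, as in the reply above, includes one extra value.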


--
David.



[R] Arellano -- how to generate data for y given t

2010-04-02 Thread forgotmystatslol

See topic
-- 
View this message in context: 
http://n4.nabble.com/Arellano-how-to-generate-data-for-y-given-t-tp1749656p1749656.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] lineplot.CI in sciplot: option ci.fun can't be changed?

2010-04-02 Thread Tao Shi

Thanks, Manuel!


 Subject: Re: lineplot.CI in sciplot: option ci.fun can't be changed?
 From: manuel.a.mora...@williams.edu
 To: shi...@hotmail.com
 CC: r-help@r-project.org; mmora...@williams.edu
 Date: Fri, 2 Apr 2010 14:22:33 -0400

 For now, just change fun(x) to median(x) (or whatever) in your ci.fun()
 below.

 E.g.
 lineplot.CI(x.factor = dose, response = len, data = ToothGrowth, ci.fun=
 function(x) c(mean(x)-2*se(x), mean(x)+2*se(x)))

 Otherwise, maybe the list members could help with a solution. An example
 that illustrates the problem:

 ex.fn <- function(x,
 fun = mean,
 fun2 = function(x) fun(x)+sd(x)) {

 list(fun=fun(x), fun2=fun2(x))
 }

 data - rnorm(10)

 ex.fn(data) #works
 ex.fn(data, fun=median) #works
 ex.fn(data, fun2=function(x) fun(x)+3) #error with fun(x) not found
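
The behaviour in this example follows from where R looks up names. A default argument is evaluated inside the function, so the default fun2 can see fun. A function supplied by the caller looks up fun in the environment where it was written, where no fun exists. The sketch below reproduces both cases and the workaround suggested above of naming the summary function directly; dat and the result names are illustrative.

```r
ex.fn <- function(x,
                  fun = mean,
                  fun2 = function(x) fun(x) + sd(x)) {
  list(fun = fun(x), fun2 = fun2(x))
}

dat <- c(1, 2, 3, 4, 10)

# Default fun2 is created inside ex.fn, so it sees fun = median
res_default <- ex.fn(dat, fun = median)

# A caller-supplied fun2 cannot see ex.fn's 'fun' argument
res_error <- tryCatch(ex.fn(dat, fun2 = function(x) fun(x) + 3),
                      error = function(e) e)

# Workaround: refer to the summary function by its own name
res_custom <- ex.fn(dat, fun = median,
                    fun2 = function(x) median(x) + 3)
res_custom
```

A package author who wants user-supplied functions to see fun would have to pass it explicitly, for example as a second argument to ci.fun.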


 On Fri, 2010-04-02 at 17:36 +, Tao Shi wrote:
 hi List and Manuel,

 I have encounter the following problem with the function lineplot.CI. I'm 
 running R 2.10.1, sciplot 1.0-7 on Win XP. It seems like it's a scoping 
 issue, but I couldn't figure it out.

 Thanks!

 ...Tao



 lineplot.CI(x.factor = dose, response = len, data = ToothGrowth) ## fine
 lineplot.CI(x.factor = dose, response = len, data = ToothGrowth, 
 fun=median) ## fine
 lineplot.CI(x.factor = dose, response = len, data = ToothGrowth, fun=mean) 
 ## fine
 lineplot.CI(x.factor = dose, response = len, data = ToothGrow[[elided 
 Hotmail spam]]
 Error in FUN(X[[1L]], ...) : could not find function fun

 debug(lineplot.CI)
 lineplot.CI(x.factor = dose, response = len, data = ToothGrowth, ci.fun= 
 function(x) c(fun(x)-2*se(x), fun(x)+2*se(x)))
 
 
 
 
 Browse[2]>
 debug: mn.data <- tapply(response, groups, fun)
 Browse[2]>
 debug: CI.data <- tapply(response, groups, ci.fun)
 Browse[2]> fun
 function (x)
 mean(x, na.rm = TRUE)
 
 Browse[2]> ci.fun
 function(x) c(fun(x)-2*se(x), fun(x)+2*se(x))
 Browse[2]> debug(ci.fun)
 Browse[2]> fun
 function (x)
 mean(x, na.rm = TRUE)
 
 Browse[2]>
 debugging in: FUN(X[[1L]], ...)
 debug: c(fun(x) - 2 * se(x), fun(x) + 2 * se(x))
 Browse[3]>
 Error in FUN(X[[1L]], ...) : could not find function "fun"
 undebug(lineplot.CI)
 lineplot.CI(x.factor = dose, response = len, data = ToothGrowth, ci.fun= 
 function(x) c(fun(x)-se(x), fun(x)+se(x)))
 Error in FUN(X[[1L]], ...) : could not find function fun
 lineplot.CI(x.factor = dose, response = len, data = ToothGrowth, fun = 
 function(x) mean(x, na.rm=TRUE),ci.fun= function(x) c(fun(x)-se(x), 
 fun(x)+se(x)))
 Error in FUN(X[[1L]], ...) : could not find function fun
 lineplot.CI(x.factor = dose, response = len, data = ToothGrowth, fun = 
 function(x) median(x, na.rm=TRUE),ci.fun= function(x) c(fun(x)-se(x), 
 fun(x)+se(x)))
 Error in FUN(X[[1L]], ...) : could not find function fun







 --
 http://mutualism.williams.edu

  


Re: [R] Extracting 'SOME' values from a linear model.

2010-04-02 Thread HouseBandit


David Winsemius wrote:
 
 
 On Apr 2, 2010, at 5:32 PM, HouseBandit wrote:
 

 my goal is to return  the selected fitted values ...
 
 Which were never really selected.
 
 ... and then perform a sum of
 squares calcuation with them. I have looked at 'list' etc but cant  
 return
 anything. Its either all of the fitted values or just the first and  
 last of
 the sub set that I need.
 
 A) In the future, don't delete the email train.
 
 B) try this code and see if you can get value out of it:
 
   vec <- 1:100
   vec[(length(vec)*0.9):length(vec)]
   [1]  90  91  92  93  94  95  96  97  98  99 100
 
 Mind you this is just a guess at what you wanted because your original
 posting seems unclear as to your goal, at least to my reading.
 
 -- 
 David.
 
 
 
 

Hi,

I had just tried something similar and got it working.

my.lm
my.lm.fit <- my.lm$fitted
my.lm.fit[(0.9*length(t)): length(t)]


Thanks for your quick replies though

Cheers
-- 
View this message in context: 
http://n4.nabble.com/Extracting-SOME-values-from-a-linear-model-tp1749643p1749705.html
Sent from the R help mailing list archive at Nabble.com.



[R] tetrachoric correlations

2010-04-02 Thread HAKAN DEMIRTAS
Hi,

Is there any R library/package that calculates tetrachoric correlations from 
given marginals and Pearson correlations among ordinal variables? 

Inputs to polychor function in polycor package are either contingency tables or 
ordinal data themselves. I am looking for something that takes marginal 
distributions and Pearson correlation as inputs.

For example, Y1=(1,2,3) with P(Y1=1)=0.3, P(Y1=2)=0.5, P(Y1=3)=0.2 and
Y2=(1,2) with P(Y2=1)=0.6, P(Y2=2)=0.4, and corr(Y1,Y2)=0.5 (Pearson 
correlation among ordinal variables)

How do I calculate the tetrachoric correlation here?

Thanks,
Hakan Demirtas 
[[alternative HTML version deleted]]



[R] compare multiple values with vector and return vector

2010-04-02 Thread Joris Meys
Dear all,

I have a vector, and for each element I want to check whether it is equal to
any element from another vector. I want a vector of logical values with the
length of the first one as return. In R this would be :

 x <- 1:10
 sapply(x,function(y){any(y==c(2,3,4))})
[1] FALSE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE

It works pretty smooth, but I have the feeling there's a less complicated
way of doing it. My code should be readable by programmers who are not
really familiar with R, but I hate to use for-loops as I have pretty huge
datasets. Anybody an idea?
thank you in advance.

Cheers
Joris

-- 
Joris Meys
Statistical Consultant

Ghent University
Faculty of Bioscience Engineering
Department of Applied mathematics, biometrics and process control

Coupure Links 653
B-9000 Gent

tel : +32 9 264 59 87
joris.m...@ugent.be
---
Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php

[[alternative HTML version deleted]]



Re: [R] compare multiple values with vector and return vector

2010-04-02 Thread Rolf Turner

On 3/04/2010, at 11:35 AM, Joris Meys wrote:

 Dear all,
 
 I have a vector, and for each element I want to check whether it is equal to
 any element from another vector. I want a vector of logical values with the
 length of the first one as return. In R this would be :
 
 x <- 1:10
 sapply(x,function(y){any(y==c(2,3,4))})
 [1] FALSE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE
 
 It works pretty smooth, but I have the feeling there's a less complicated
 way of doing it. My code should be readable by programmers who are not
 really familiar with R, but I hate to use for-loops as I have pretty huge
 datasets. Anybody an idea?
 thank you in advance.

?%in%
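
Applied to the example from the question, the operator gives the same logical vector as the sapply() version. The help page is reached with the name quoted, as in ?"%in%". The variable names below are illustrative.

```r
x <- 1:10
targets <- c(2, 3, 4)

# Vectorised membership test: one logical value per element of x
via_in <- x %in% targets

# The original formulation, for comparison
via_sapply <- sapply(x, function(y) any(y == targets))

via_in
identical(via_in, via_sapply)
```

The result always has the length of the left-hand operand, so it can be used directly for subsetting, as in x[x %in% targets].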

cheers,

Rolf Turner


[R] Problems with PDF/Latex when building a package

2010-04-02 Thread Erin Hodgess
Dear R People:

I'm building a package on an Ubuntu Karmic Koala 9.10 system and am
getting the following errors:


* checking PDF version of manual ... WARNING
LaTeX errors when creating PDF version.
This typically indicates Rd problems.
LaTeX errors found:
! Font T1/ptm/m/n/10=ptmr8t at 10.0pt not loadable: Metric (TFM) file not found
.
to be read again
   relax
l.7 \begin{document}
! Font T1/ptm/m/n/24.88=ptmr8t at 24.88pt not loadable: Metric (TFM) file not f
ound.
to be read again
   relax
l.8 \chapter*{}
! Font T1/ptm/bx/n/24.88=ptmb8t at 24.88pt not loadable: Metric (TFM) file not
found.
to be read again
   relax
l.8 \chapter*{}
! Font \T1/ptm/b/n/24.88=nullfont not loadable: Metric (TFM) file not found.
to be read again
   \relax
l.8 \chapter*{}

! Font T1/ptm/bx/n/10=ptmb8t at 10.0pt not loadable: Metric (TFM) file not foun
d.
to be read again
   relax
l.10 {\textbf{\huge Package `RcmdrPlugin.epack'}
! Font \T1/ptm/b/n/10=nullfont not loadable: Metric (TFM) file not found.
to be read again
   \relax
l.10 {\textbf{\huge Package `RcmdrPlugin.epack'}
}
! Font T1/ptm/bx/n/20.74=ptmb8t at 20.74pt not loadable: Metric (TFM) file not
found.
to be read again
   relax
l.10 {\textbf{\huge Package `RcmdrPlugin.epack'}
! Font \T1/ptm/b/n/20.74=nullfont not loadable: Metric (TFM) file not found.
to be read again
   \relax
l.10 {\textbf{\huge Package `RcmdrPlugin.epack'}
}
! Font T1/ptm/m/n/12=ptmr8t at 12.0pt not loadable: Metric (TFM) file not found
.
to be read again
   relax
l.11 \par\bigskip{\large
! Font T1/pcr/m/n/10=pcrr8t at 10.0pt not loadable: Metric (TFM) file not found
.
to be read again
   relax
l.19 ...AsIs{Erin hodgess\email{hodge...@uhd.edu}}
! Font T1/ptm/m/n/14.4=ptmr8t at 14.4pt not loadable: Metric (TFM) file not fou
nd.
to be read again
   relax
l.33 ...r Plug-In}{RcmdrepackPlugin.Rdash.package}
! Font T1/ptm/bx/n/14.4=ptmb8t at 14.4pt not loadable: Metric (TFM) file not fo
und.
to be read again
   relax
l.33 ...r Plug-In}{RcmdrepackPlugin.Rdash.package}
! Font \T1/ptm/b/n/14.4=nullfont not loadable: Metric (TFM) file not found.
to be read again
   \relax
l.33 ...r Plug-In}{RcmdrepackPlugin.Rdash.package}

! Font T1/phv/m/n/14.4=phvr8t at 14.4pt not loadable: Metric (TFM) file not fou
nd.
to be read again
   relax
l.33 ...r Plug-In}{RcmdrepackPlugin.Rdash.package}
! Font T1/ptm/m/it/10=ptmri8t at 10.0pt not loadable: Metric (TFM) file not fou
nd.
to be read again
   relax
l.33 ...r Plug-In}{RcmdrepackPlugin.Rdash.package}
! Font T1/ptm/m/sl/10=ptmro8t at 10.0pt not loadable: Metric (TFM) file not fou
nd.
to be read again
   relax
l.62 \end{document}
! Font T1/phv/m/n/10=phvr8t at 10.0pt not loadable: Metric (TFM) file not found
.
to be read again
   relax
l.62 \end{document}
* checking PDF version of manual without index ... ERROR
e...@ubuntu:~$


When I run on another system, it runs fine.  Has anyone run into this
lately, please?

I installed pdflatex via sudo apt-get install pdflatex.
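
One way to narrow this down from within R is to ask the TeX installation whether it can locate the metric files named in the log. The sketch below is only a diagnostic aid under stated assumptions: it assumes the kpsewhich utility from a TeX distribution is on the PATH, and the helper name tex_file_found is hypothetical.

```r
# Hypothetical helper: TRUE if TeX can locate 'file', FALSE if it cannot,
# NA if the kpsewhich lookup tool itself is not on the PATH.
tex_file_found <- function(file) {
  kpsewhich <- Sys.which("kpsewhich")
  if (!nzchar(kpsewhich)) return(NA)
  # kpsewhich prints the full path when the file is found and exits with a
  # non-zero status (and no output) otherwise; silence the resulting warning
  path <- suppressWarnings(
    system2(kpsewhich, file, stdout = TRUE, stderr = FALSE)
  )
  length(path) > 0 && nzchar(path[1])
}

# Metric files named in the error log above
for (f in c("ptmr8t.tfm", "ptmb8t.tfm", "pcrr8t.tfm", "phvr8t.tfm")) {
  cat(f, ":", tex_file_found(f), "\n")
}
```

If these come back FALSE, the font metrics are probably not installed on this machine. On Debian and Ubuntu they are usually shipped in a separate font package (commonly texlive-fonts-recommended), though that should be checked against the local package index rather than taken as given.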

Thanks,
Erin


-- 
Erin Hodgess
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: erinm.hodg...@gmail.com


