Re: [R] R-help Digest, Vol 123, Issue 30

2013-05-27 Thread Neotropical bat risk assessments
Hi all, are there any R packages that include circular statistics similar to 
Oriana (http://www.kovcomp.co.uk/oriana/newver4.html)?


I am interested in looking at annual patterns of bat activity where data 
will have date/times and relative abundance values for each Date.


I would like a circular plot whose circumference axis shows the 12 months 
of the year, with relative abundance as the plotted value; with ggplot2 
the colour could presumably be mapped to species (colour = species).


Thanks

Bruce

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Construct plot combination using grid without plotting and retrieving an object?

2013-05-27 Thread Paul Murrell

Hi

On 05/24/13 20:24, Johannes Graumann wrote:

Hi,

I'm currently combining multiple plots using something along the lines
of the following pseudo-code:

library(grid)
grid.newpage()
tmpLayout <- grid.layout(
 nrow=4,
 ncol=2)
pushViewport(viewport(layout = tmpLayout))

and then proceeding with filling the viewports ... works fine, but for
packaging of functions I would really prefer if I could assemble all of
this in an object which in the end would be callable with print.

I'm envisioning something along the lines of what I can do with
ggplot2: return a plot as a ggplot object and plot it later rather
than as I assemble it. Is that possible with a complex grid figure?

Thanks for any pointers.


You can work off-screen with grobs and gTrees and vpTrees, for example ...

library(grid)
vplay <- viewport(layout=grid.layout(2, 2),
                  name="vplay")
vp.1.1 <- viewport(layout.pos.col=1, layout.pos.row=1,
                   name="vp.1.1")
vp.2.2 <- viewport(layout.pos.col=2, layout.pos.row=2,
                   name="vp.2.2")
x <- gTree(childrenvp=vpTree(vplay, vpList(vp.1.1, vp.2.2)),
           children=gList(
               rectGrob(vp="vplay::vp.1.1", gp=gpar(fill="grey")),
               textGrob("1", vp="vplay::vp.1.1"),
               rectGrob(vp="vplay::vp.2.2", gp=gpar(fill="grey")),
               textGrob("2", vp="vplay::vp.2.2")))
grid.newpage()
grid.draw(x)

... is that the sort of thing you mean?

Paul


Joh




--
Dr Paul Murrell
Department of Statistics
The University of Auckland
Private Bag 92019
Auckland
New Zealand
64 9 3737599 x85392
p...@stat.auckland.ac.nz
http://www.stat.auckland.ac.nz/~paul/
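
To get closer to the ggplot2-style "build now, print later" workflow asked about, the off-screen gTree can be returned from a function and given its own class with a print method. This is only a sketch: the names make_panel_figure, panelfigure and print.panelfigure are made up for illustration and are not part of grid.

```r
library(grid)

# Hypothetical constructor: assembles the figure off-screen and returns it.
make_panel_figure <- function(labels = c("1", "2")) {
  vplay <- viewport(layout = grid.layout(2, 2), name = "vplay")
  vp.1.1 <- viewport(layout.pos.col = 1, layout.pos.row = 1, name = "vp.1.1")
  vp.2.2 <- viewport(layout.pos.col = 2, layout.pos.row = 2, name = "vp.2.2")
  gTree(
    childrenvp = vpTree(vplay, vpList(vp.1.1, vp.2.2)),
    children = gList(
      rectGrob(vp = "vplay::vp.1.1", gp = gpar(fill = "grey")),
      textGrob(labels[1], vp = "vplay::vp.1.1"),
      rectGrob(vp = "vplay::vp.2.2", gp = gpar(fill = "grey")),
      textGrob(labels[2], vp = "vplay::vp.2.2")
    ),
    cl = "panelfigure"  # hypothetical extra class so print() can dispatch
  )
}

# Hypothetical print method: drawing only happens here.
print.panelfigure <- function(x, newpage = TRUE, ...) {
  if (newpage) grid.newpage()
  grid.draw(x)
  invisible(x)
}

fig <- make_panel_figure()  # nothing is drawn yet
```

Calling print(fig), or typing fig at the prompt, then draws the figure on the current device.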



Re: [R] 3d interactive video using the rgl package

2013-05-27 Thread Xavier Hoenner
Hi Duncan,

Thanks a lot for your response, that was very helpful. I've managed to get my 
head around the javascript code produced by the writeWebGL function: I now have 
a 4d interactive animation that can be played in a web browser. Let me know if 
you're interested in seeing it and I'll send it to you by email.

Thanks again for your kind help.

Regards,

Xavier Hoenner


From: Duncan Murdoch [murdoch.dun...@gmail.com]
Sent: Thursday, 16 May 2013 11:52 PM
To: Xavier Hoenner
Cc: r-help@r-project.org
Subject: Re: [R] 3d interactive video using the rgl package

On 16/05/2013 4:06 AM, Xavier Hoenner wrote:
 Hi all,

 I've been using the 'rgl' package to visualise in 3d the water temperature 
 recorded by a glider deployed off the coast of Australia (see snapshot 
 attached). Using the writeWebGL function, I'm able to produce an html file of 
 the scene with which I can then interact (e.g. zoom in/out, rotate) in my web 
 browser.

 In R, I have created another scene that includes a loop plotting the 
 movements of the glider with the time. Is it possible to export that whole 
 animation with the writeWebGL function? I've only managed to export the scene 
 once all the points of my loop have been plotted, and the movie3d() function 
 is not really a good option for me as I would like to be able to interact 
 with my 3d animation in my web browser.

 Thanks in anticipation for your help.

No, that's not currently supported.  You could probably do it using
Javascript in the web page produced by writeWebGL.  I'm not sure whether
it could be done entirely using the template argument, or whether you'd
need to manually edit the writeWebGL output.

If you put together something like this, please let me know.  I'd like
to see it.

Duncan Murdoch


 Xavier

 

 Dr. Xavier Hoenner

 eMII Project Officer, Integrated Marine Observing System (IMOS)

 University of Tasmania, Sandy Bay campus, Maths Building, room 355, Private 
 Bag 21, Hobart, TAS 7001

 Tel: +61 3 6226 1752; Mob: +61 411 271 462; Fax: +61 3 6226 8575

 Email: xavier.hoen...@utas.edu.au, URL: http://imos.org.au/emii.html

 




[R] configure ddply() to avoid reordering of '.variables'

2013-05-27 Thread Liviu Andronic
Hello,
I'm using ddply() in plyr and I notice that it has the habit of
re-ordering the levels of the '.variables' by which the splitting is
done. I'm concerned about correctly retrieving the original ordering.

Consider:
require(plyr)
x <- iris[ order(iris$Species, decreasing=TRUE), ]
head(x)
#    Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
#101          6.3         3.3          6.0         2.5 virginica
#102          5.8         2.7          5.1         1.9 virginica
#103          7.1         3.0          5.9         2.1 virginica
#104          6.3         2.9          5.6         1.8 virginica
#105          6.5         3.0          5.8         2.2 virginica
#106          7.6         3.0          6.6         2.1 virginica
xa <- ddply(x, .(Species), function(x)
{data.frame(Sepal.Length=x$Sepal.Length, mean.adj=(x$Sepal.Length -
mean(x$Sepal.Length)))})
##notice how the ordering of Species is different
##from that in the input data frame
head(xa)
#  Species Sepal.Length mean.adj
#1  setosa          5.1    0.094
#2  setosa          4.9   -0.106
#3  setosa          4.7   -0.306
#4  setosa          4.6   -0.406
#5  setosa          5.0   -0.006
#6  setosa          5.4    0.394
all.equal(xa$Species, x$Species)
#[1] "100 string mismatches"
all.equal(xa[ order(xa$Species, decreasing=TRUE), ]$Species, x$Species)
#[1] TRUE
all.equal(xa$Sepal.Length, x$Sepal.Length)
#[1] "Mean relative difference: 0.2785"
all.equal(xa[ order(xa$Species, decreasing=TRUE), ]$Sepal.Length, x$Sepal.Length)
#[1] TRUE

In my real data, should I be concerned that simply reordering by the
'.variables' variable wouldn't necessarily restore the original
ordering as in the input data frame? Is it possible to instruct
ddply() to avoid re-ordering the supplied '.variables' variable?

Regards,
Liviu


-- 
Do you know how to read?
http://www.alienetworks.com/srtest.cfm
http://goodies.xfce.org/projects/applications/xfce4-dict#speed-reader
Do you know how to write?
http://garbl.home.comcast.net/~garbl/stylemanual/e.htm#e-mail



[R] Indexing within by statement - different coloured lines in abline wanted..

2013-05-27 Thread Tom Wilding
Dear R-list

I'm trying to get each regression line, plotted using abline, to be of a 
different colour as the following code illustrates.  I'm hoping there is a 
simple indexing solution.  Many thanks.

## code from here
colours=c("black","red","blue","green","pink")
Mean=500;Sd=10;NosSites=5;Xaxis=seq(1,5,1)
SlopeCoefficient=5;Site=(gl(NosSites,length(Xaxis),labels=1:NosSites))
Predictor=rep(Xaxis,NosSites)
InterceptAdjustment=rnorm(n=NosSites,mean=Xaxis,sd=50)
RandomIntercept=rep(InterceptAdjustment,each=length(Xaxis))
PreResponse=rnorm(n=length(Predictor), 
mean=Mean+SlopeCoefficient*1:length(Xaxis),sd=Sd)
Response1=PreResponse+RandomIntercept

#create data frame
Data2=data.frame(Site,Predictor,Mean,SlopeCoefficient,RandomIntercept,Response1)
Data1=data.frame(Site=Data2$Site,Predictor=Data2$Predictor,Response1=Data2$Response1)
#plotting
var=as.numeric(levels(Data1$Site))
par(mfrow=c(1,3))
plot(Response1~Predictor,data=Data1,xlim=c(min(Xaxis),max(Xaxis)),ylim=c(MN,MX),
 pch=as.numeric(Site),main="Raw data with linear regressions by Site")
by(Data1,Data1$Site,function(Site){
  par(new=T)
  abline(lm(Response1~Predictor,data=Site),col=colours[])#index in here.
})
The Scottish Association for Marine Science (SAMS) is registered in Scotland as 
a Company Limited by Guarantee (SC009292) and is a registered charity (9206). 
SAMS has an actively trading wholly owned subsidiary company: SAMS Research 
Services Ltd a Limited Company (SC224404). All Companies in the group are 
registered in Scotland and share a registered office at Scottish Marine 
Institute, Oban Argyll PA37 1QA. The content of this message may contain 
personal views which are not the views of SAMS unless specifically stated. 
Please note that all email traffic is monitored for purposes of security and 
spam filtering. As such individual emails may be examined in more detail.



[R] Classification of Multivariate Time Series

2013-05-27 Thread Lorenzo Isella
Dear All,
Apologies for not posting a code snippet, but I really need a pointer about
a methodology to look at my data and possibly some R package which can ease
my task.
I am given a set consisting of several multivariate noisy time series,
let's call it {A}.
Each A_i in {A}, in turn, consists of several numerical time series.
Then I have another set of shorter time series {B}.
Now, for every B_j in {B}, I need to determine the time series A_i from which
B_j most likely comes (A_i is not just a subset of B_j).
In other words, I need to determine the distance between A_i and B_j.
I was thinking about the Mahalanobis distance described here.

http://en.wikipedia.org/wiki/Mahalanobis_distance

However, I have several questions in my head
1) With the Mahalanobis distance, do I lose the info about the time
structure of the data? I am not just comparing some distributions, but some
time series and the ordering of the data is important.
2) Even if the use of the Mahalanobis distance was appropriate, it involves
the calculation of a covariance matrix and a mean.
Should I average A_i or B_j (or a subset of B_j having the same length as
A_i)? And should I use a correlation matrix based on A_i or B_j?

Any suggestion is welcome.

Lorenzo
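
On question 1, a small base R sketch may help: the Mahalanobis distance between the mean of B_j and the distribution of A_i uses only a mean vector and a covariance matrix, so shuffling the time order of B_j leaves it unchanged. The matrices A and B below are random stand-ins (an assumption, not the poster's data), with rows as time points and columns as variables.

```r
set.seed(1)
A <- matrix(rnorm(300), ncol = 3)  # stand-in for one A_i: 100 time points, 3 variables
B <- matrix(rnorm(60), ncol = 3)   # stand-in for one shorter B_j: 20 time points

# mahalanobis() returns the *squared* distance of a point from a centre.
d_original <- mahalanobis(colMeans(B), center = colMeans(A), cov = cov(A))

# Destroy the time ordering of B by permuting its rows.
B_shuffled <- B[sample(nrow(B)), ]
d_shuffled <- mahalanobis(colMeans(B_shuffled), center = colMeans(A), cov = cov(A))

# Same value up to floating-point rounding: the ordering carries no weight here.
same_distance <- isTRUE(all.equal(d_original, d_shuffled))
```

Any measure that should respect ordering therefore has to compare the series point by point (or lag by lag) rather than through their means and covariances alone.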



[R] Time Series prediction

2013-05-27 Thread Giovanni Azua
Hello,

I would like to use a parametric TS model and predictor as benchmark to
compare against other ML methods I'm employing. I currently build a simple
e.g. ARIMA model using the convenient auto.arima function like this:

library(forecast)
df <- read.table("/Users/bravegag/data/myts.dat")
# btw my test data doesn't have seasonality but periodicity so the value
# 2 is arbitrarily set, using a freq of yearly or 1 would make unhappy some
# R ts functions
tsdata <- ts(df$signal, freq=2)
arimamodel <- auto.arima(tsdata, max.p=15, max.q=10, stationary=FALSE,
ic="bic", stepwise=TRUE, seasonal=FALSE, parallel=FALSE, num.cores=4,
trace=TRUE, allowdrift=TRUE)
arimapred <- forecast.Arima(arimamodel, h=20)
plot(arimapred)

The problem is that the forecast.Arima function is apparently doing a free
run, i.e. it uses the forecast(t+1) value as input to compute forecast(t+2),
whereas I'm interested in a prediction mode where it always uses the
observed tsdata(t+1) value to predict forecast(t+2), the observed
tsdata(t+2) to predict forecast(t+3) and so on.

Can anyone please advise how to achieve this?

TIA,
Best regards,
Giovanni
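
One possible way to get the one-step-ahead mode described above, sketched with base R's arima() rather than auto.arima(): estimate the coefficients once on a training part, then re-apply them unchanged (via the fixed argument) to the observed values up to t-1 and predict a single step each time. The simulated AR(1) series, the split at 200 points and the model order are assumptions made for illustration. The forecast package's Arima() also has a model argument for re-applying a fitted model to new data, which may be the more direct route there.

```r
# Simulated stand-in for the real series (assumption: AR(1), 220 points).
set.seed(42)
y <- as.numeric(arima.sim(model = list(ar = 0.7), n = 220))
n_train <- 200
ord <- c(1, 0, 0)

# Estimate the coefficients once, on the training part only.
fit <- arima(y[1:n_train], order = ord)

# For each later time t, apply the *fixed* coefficients to the observed
# values up to t - 1 and predict one step ahead (no re-estimation).
one_step <- sapply((n_train + 1):length(y), function(t) {
  refit <- arima(y[1:(t - 1)], order = ord, method = "ML",
                 fixed = coef(fit), transform.pars = FALSE)
  as.numeric(predict(refit, n.ahead = 1)$pred)
})

# Free-run forecast from the training data, for comparison.
free_run <- as.numeric(predict(fit, n.ahead = length(y) - n_train)$pred)
```

The first one-step prediction coincides with the first free-run value; from the second step on, one_step uses the observed data while free_run feeds on its own forecasts.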



Re: [R] Classification of Multivariate Time Series

2013-05-27 Thread Emre Sahin
Did you have a look at Dynamic Time Warping and the dtw package?

Best, E. 

On Mon, May 27, 2013 at 01:34:42PM +0200, Lorenzo Isella wrote:


Re: [R] Indexing within by statement - different coloured lines in abline wanted..

2013-05-27 Thread John Kane
A slightly different approach, but will this do what you want?

library(ggplot2)
ggplot(Data1, aes(Predictor, Response1, colour = Site)) +
  geom_smooth(method = "lm", se = FALSE) +
  ggtitle("Raw data with linear regressions by Site")

John Kane
Kingston ON Canada


 -Original Message-
 From: tom.wild...@sams.ac.uk
 Sent: Mon, 27 May 2013 10:39:58 +
 To: r-help@r-project.org
 Subject: [R] Indexing within by statement - different coloured lines in
 abline wanted..
 


[R] metaMDS with large dataset produces 'insufficient data' warning

2013-05-27 Thread Raeanne Miller
Greetings everyone,

I am running MDS on a very large dataset (12 x 25071 - 12 model runs with 25071 
output values each), and also on a very much reduced version of the dataset 
(randomly select 1000 of the 25071 output values). I would like to look at 
similarities/dissimilarities between the 12 model runs. When I use metaMDS on 
the full dataset, I get a warning message:

Warning message:
In metaMDS(MDSdata, distance = "bray", k = 2, autotransform = FALSE) :
  Stress is (nearly) zero - you may have insufficient data

I don't think I have insufficient data, with 12 x 25071 data points, and when I 
reduce the dataset to only 1000 values per model run (so only 12 x 1000) I 
don't get this warning (though the final stress is now only just below 0.2 - my 
desired value).

Is this warning because I have insufficient data? Or is it because of the 
nature of a large dataset?

I can supply a dataset in .txt format by email, if that would be helpful.

Thanks for your help,

Raeanne



Re: [R] Indexing within by statement - different coloured lines in abline wanted..

2013-05-27 Thread Blaser Nello
abline(lm(Response1~Predictor,data=Site),col=colours[as.numeric(Site[1,1])])
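
The one-liner works because, inside by(), Site[1,1] is the site factor for that subset, and as.numeric() turns it into the level index used to pick a colour. An alternative sketch that avoids by() and par(new=T) is to loop over the factor levels explicitly. The helper name plot_site_fits and the simulated data are assumptions for illustration; only the column names follow the original post.

```r
# Hypothetical helper: one scatter plot, one regression line per site.
plot_site_fits <- function(data, colours) {
  plot(Response1 ~ Predictor, data = data, pch = as.numeric(data$Site),
       main = "Raw data with linear regressions by Site")
  sites <- levels(data$Site)
  fits <- lapply(sites, function(s) {
    lm(Response1 ~ Predictor, data = data[data$Site == s, ])
  })
  names(fits) <- sites
  for (i in seq_along(fits)) {
    abline(fits[[i]], col = colours[i])  # i-th site gets the i-th colour
  }
  invisible(fits)
}

# Small simulated data set in the shape of Data1 (values are made up).
set.seed(1)
Data1 <- data.frame(Site = gl(5, 5, labels = 1:5), Predictor = rep(1:5, 5))
Data1$Response1 <- 500 + 5 * Data1$Predictor +
  rep(rnorm(5, sd = 50), each = 5) + rnorm(25, sd = 10)
colours <- c("black", "red", "blue", "green", "pink")
```

Calling plot_site_fits(Data1, colours) draws the points and the five coloured lines, and invisibly returns the fitted lm objects.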

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
On Behalf Of Tom Wilding
Sent: Montag, 27. Mai 2013 12:40
To: r-help@r-project.org
Subject: [R] Indexing within by statement - different coloured lines in
abline wanted..



Re: [R] Classification of Multivariate Time Series

2013-05-27 Thread Roy Mendelssohn - NOAA Federal
Look at:

"State-Space Discrimination and Clustering of Atmospheric Time Series Data 
Based on Kullback Information Measures", by Thomas Bengtsson.

If you Google the topic, there are a host of other papers too, but this one 
meshes with existing state-space methods.

-Roy

On May 27, 2013, at 4:34 AM, Lorenzo Isella lorenzo.ise...@gmail.com wrote:


**
The contents of this message do not reflect any position of the U.S. 
Government or NOAA.
**
Roy Mendelssohn
Supervisory Operations Research Analyst
NOAA/NMFS
Environmental Research Division
Southwest Fisheries Science Center
1352 Lighthouse Avenue
Pacific Grove, CA 93950-2097

e-mail: roy.mendelss...@noaa.gov (Note new e-mail address)
voice: (831)-648-9029
fax: (831)-648-8440
www: http://www.pfeg.noaa.gov/

Old age and treachery will overcome youth and skill.
From those who have been given much, much will be expected 
the arc of the moral universe is long, but it bends toward justice -MLK Jr.



Re: [R] configure ddply() to avoid reordering of '.variables'

2013-05-27 Thread arun
Maybe this helps:

levels(x$Species)
#[1] "setosa"     "versicolor" "virginica" 
x$Species <- factor(x$Species,levels=unique(x$Species))
xa <- ddply(x, .(Species), function(x)
 {data.frame(Sepal.Length=x$Sepal.Length, mean.adj=(x$Sepal.Length -
 mean(x$Sepal.Length)))})
head(xa)
#    Species Sepal.Length mean.adj
#1 virginica          6.3   -0.288
#2 virginica          5.8   -0.788
#3 virginica          7.1    0.512
#4 virginica          6.3   -0.288
#5 virginica          6.5   -0.088
#6 virginica          7.6    1.012


A.K.
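
For this particular task (subtracting a group mean), a base R alternative sidesteps the ordering question altogether, because ave() returns the group statistic aligned with the original rows. This swaps ddply() for ave(), so it is a different technique from the one above, shown here only as a sketch.

```r
# Same input as in the thread: iris sorted so that virginica comes first.
x <- iris[order(iris$Species, decreasing = TRUE), ]

# ave() returns the group mean aligned with each row, so nothing is reordered.
x$mean.adj <- x$Sepal.Length - ave(x$Sepal.Length, x$Species)

head(x[, c("Species", "Sepal.Length", "mean.adj")])
```

The rows of x stay exactly where they were, so no reordering step is needed afterwards.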


- Original Message -
From: Liviu Andronic landronim...@gmail.com
To: r-help@r-project.org Help r-help@r-project.org
Cc: 
Sent: Monday, May 27, 2013 4:47 AM
Subject: [R] configure ddply() to avoid reordering of '.variables'



[R] Question about subsetting S4 object in ROCR

2013-05-27 Thread Guido Leoni
Dear list
I'm testing a predictor and I produced nice performance plots with the ROCR
package using the 3 standard commands

pred <- prediction(predictions, labels)
perf <- performance(pred, measure = "tpr", x.measure = "fpr")
plot(perf, col=rainbow(10))

The pred object and the perf object are S4 objects.
The perf object has the following slots:

An object of class "performance"
Slot "x.name":
[1] "False positive rate"

Slot "y.name":
[1] "True positive rate"

Slot "alpha.name":
[1] "Cutoff"

Slot "x.values":
[[1]]
 [1] 0.00 0.00 0.05 0.10 0.10 0.10 0.10 0.10 0.15 0.15 0.15 0.20 0.25 0.25 0.25 0.25 0.25 0.30 0.35 0.35 0.35 0.40 0.40 0.45 0.50 0.50 0.55 0.55 0.60
[30] 0.65 0.65 0.70 0.70 0.75 0.80 0.85 0.90 0.90 0.95 1.00 1.00


Slot "y.values":
[[1]]
 [1] 0.00 0.05 0.05 0.05 0.10 0.15 0.20 0.25 0.25 0.30 0.35 0.35 0.35 0.40 0.45 0.50 0.55 0.55 0.55 0.60 0.65 0.65 0.70 0.70 0.70 0.75 0.75 0.80 0.80
[30] 0.80 0.85 0.85 0.90 0.90 0.90 0.90 0.90 0.95 0.95 0.95 1.00


Slot "alpha.values":
[[1]]
 [1]   Inf 33309 32968 31688 31648 31355 31122 31047 30777 30589 30460 30395 30305 30159 29841 29101 28734 28657 28393 28196 27740 27662 27373 27078
[25] 26763 26303 25573 25416 25364 25357 24993 23834 23789 23616 22357 20669 20092 18720 18136 17323 16665


Now I'd like to make a plot (and also compute the AUC) only of the area
corresponding to  0.80  y.values and 0.40  x.values.
In your experience, is it possible to subset the perf object to
the aforementioned values?
 Thanks
Guido
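
A possible starting point, sketched in base R: S4 slots are read with @, so the curve coordinates can be pulled out as plain vectors, subset, and integrated with the trapezoidal rule. The vectors below are copied from the slot listing above so the sketch runs without ROCR. The comparison operators in the question were lost, so restricting to a false positive rate of at most 0.40 is an assumption. ROCR's documentation for the "auc" measure also describes an fpr.stop argument for a partial AUC, which may be worth checking.

```r
# In a live session these would come from the S4 slots, e.g.
#   fpr <- perf@x.values[[1]]; tpr <- perf@y.values[[1]]
fpr <- c(0.00, 0.00, 0.05, 0.10, 0.10, 0.10, 0.10, 0.10, 0.15, 0.15, 0.15,
         0.20, 0.25, 0.25, 0.25, 0.25, 0.25, 0.30, 0.35, 0.35, 0.35, 0.40,
         0.40, 0.45, 0.50, 0.50, 0.55, 0.55, 0.60, 0.65, 0.65, 0.70, 0.70,
         0.75, 0.80, 0.85, 0.90, 0.90, 0.95, 1.00, 1.00)
tpr <- c(0.00, 0.05, 0.05, 0.05, 0.10, 0.15, 0.20, 0.25, 0.25, 0.30, 0.35,
         0.35, 0.35, 0.40, 0.45, 0.50, 0.55, 0.55, 0.55, 0.60, 0.65, 0.65,
         0.70, 0.70, 0.70, 0.75, 0.75, 0.80, 0.80, 0.80, 0.85, 0.85, 0.90,
         0.90, 0.90, 0.90, 0.90, 0.95, 0.95, 0.95, 1.00)

# Assumption: keep the part of the curve with FPR <= 0.40.
keep <- fpr <= 0.40
fpr_sub <- fpr[keep]
tpr_sub <- tpr[keep]

# Trapezoidal rule over the retained points gives the partial AUC.
pauc <- sum(diff(fpr_sub) * (head(tpr_sub, -1) + tail(tpr_sub, -1)) / 2)

# plot(fpr_sub, tpr_sub, type = "l") would draw just this segment.
```

A further condition on tpr (for example around 0.80) could be added to keep in the same way, once the intended direction of the comparison is settled.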



Re: [R] configure ddply() to avoid reordering of '.variables'

2013-05-27 Thread arun
Also,
you can check:
http://stackoverflow.com/questions/7235421/how-to-ddply-without-sorting


keeping.order <- function(data, fn, ...) { 
  col <- ".sortColumn"
  data[,col] <- 1:nrow(data) 
  out <- fn(data, ...) 
  if (!col %in% colnames(out)) stop("Ordering column not preserved by function") 
  out <- out[order(out[,col]),] 
  out[,col] <- NULL 
  out 
}
x <- iris[ order(iris$Species, decreasing=T), ]
xa <- ddply(x,.(Species),mutate,mean.adj=Sepal.Length-mean(Sepal.Length))[-c(2:4)]
xa1 <- keeping.order(x,ddply,.(Species),mutate,mean.adj=Sepal.Length-mean(Sepal.Length))[-c(2:4)]
head(xa1)
#    Sepal.Length   Species mean.adj
#101          6.3 virginica   -0.288
#102          5.8 virginica   -0.788
#103          7.1 virginica    0.512
#104          6.3 virginica   -0.288
#105          6.5 virginica   -0.088
#106          7.6 virginica    1.012
A.K.



- Original Message -
From: arun smartpink...@yahoo.com
To: Liviu Andronic landronim...@gmail.com
Cc: R help r-help@r-project.org
Sent: Monday, May 27, 2013 10:06 AM
Subject: Re: [R] configure ddply() to avoid reordering of '.variables'

Maybe this helps:

levels(x$Species)
#[1] "setosa"     "versicolor" "virginica" 
x$Species <- factor(x$Species,levels=unique(x$Species))
xa <- ddply(x, .(Species), function(x)
 {data.frame(Sepal.Length=x$Sepal.Length, mean.adj=(x$Sepal.Length -
 mean(x$Sepal.Length)))})
head(xa)
#    Species Sepal.Length mean.adj
#1 virginica  6.3   -0.288
#2 virginica  5.8   -0.788
#3 virginica  7.1    0.512
#4 virginica  6.3   -0.288
#5 virginica  6.5   -0.088
#6 virginica  7.6    1.012


A.K.


- Original Message -
From: Liviu Andronic landronim...@gmail.com
To: r-help@r-project.org Help r-help@r-project.org
Cc: 
Sent: Monday, May 27, 2013 4:47 AM
Subject: [R] configure ddply() to avoid reordering of '.variables'

Hello,
I'm using ddply() in plyr and I notice that it has the habit of
re-ordering the levels of the '.variables' by which the splitting is
done. I'm concerned about correctly retrieving the original ordering.

Consider:
require(plyr)
x <- iris[ order(iris$Species, decreasing=T), ]
head(x)
#    Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
#101          6.3         3.3          6.0         2.5 virginica
#102          5.8         2.7          5.1         1.9 virginica
#103          7.1         3.0          5.9         2.1 virginica
#104          6.3         2.9          5.6         1.8 virginica
#105          6.5         3.0          5.8         2.2 virginica
#106          7.6         3.0          6.6         2.1 virginica
xa <- ddply(x, .(Species), function(x)
{data.frame(Sepal.Length=x$Sepal.Length, mean.adj=(x$Sepal.Length -
mean(x$Sepal.Length)))})
#  |==| 100%
##notice how the ordering of Species is different
##from that in the input data frame
head(xa)
#  Species Sepal.Length mean.adj
#1  setosa          5.1    0.094
#2  setosa          4.9   -0.106
#3  setosa          4.7   -0.306
#4  setosa          4.6   -0.406
#5  setosa          5.0   -0.006
#6  setosa          5.4    0.394
all.equal(xa$Species, x$Species)
#[1] "100 string mismatches"
all.equal(xa[ order(xa$Species, decreasing=T), ]$Species, x$Species)
#[1] TRUE
all.equal(xa$Sepal.Length, x$Sepal.Length)
#[1] "Mean relative difference: 0.2785"
all.equal(xa[ order(xa$Species, decreasing=T), ]$Sepal.Length, x$Sepal.Length)
#[1] TRUE

In my real data, should I be concerned that simply reordering by the
'.variables' variable wouldn't necessarily restore the original
ordering as in the input data frame? Is it possible to instruct
ddply() to avoid re-ordering the supplied '.variables' variable?

Regards,
Liviu


-- 
Do you know how to read?
http://www.alienetworks.com/srtest.cfm
http://goodies.xfce.org/projects/applications/xfce4-dict#speed-reader
Do you know how to write?
http://garbl.home.comcast.net/~garbl/stylemanual/e.htm#e-mail



[R] updating observations in lm

2013-05-27 Thread ivo welch
dear R experts---I would like to update OLS regressions with new
observations on the front of the data, and delete some old
observations from the rear.  my goal is to have a flexible
moving-window regression, with a minimum number of observations and a
maximum number of observations.  I can keep (X' X) and (X' y), and add
or subtract observations from these two quantities myself, and then
use crossprod.
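
A minimal sketch of that bookkeeping, under stated assumptions: roll_ols, min_obs and max_obs are hypothetical names, X already contains the intercept column, and only coefficients are tracked (standard errors would also need y'y and the window length). It illustrates adding and subtracting rank-one terms from X'X and X'y; it is not a tested library routine.

```r
# Moving-window OLS that updates X'X and X'y one observation at a time.
# All names here are hypothetical; X must already include the intercept.
roll_ols <- function(X, y, min_obs, max_obs) {
  n <- nrow(X); k <- ncol(X)
  XtX <- matrix(0, k, k)
  Xty <- numeric(k)
  coefs <- matrix(NA_real_, n, k)
  first <- 1L                          # oldest row still inside the window
  for (t in seq_len(n)) {
    xt <- X[t, ]
    XtX <- XtX + outer(xt, xt)         # add the newest observation
    Xty <- Xty + xt * y[t]
    if (t - first + 1L > max_obs) {    # window too long: drop the oldest row
      xo <- X[first, ]
      XtX <- XtX - outer(xo, xo)
      Xty <- Xty - xo * y[first]
      first <- first + 1L
    }
    # coefficients are only reported once the window holds min_obs rows
    if (t - first + 1L >= min_obs) coefs[t, ] <- solve(XtX, Xty)
  }
  coefs
}

set.seed(1)
X <- cbind(1, rnorm(50))
y <- 2 + 3 * X[, 2] + rnorm(50)
b <- roll_ols(X, y, min_obs = 10, max_obs = 20)
tail(b, 3)   # coefficients for the last three 20-observation windows
```

Repeated subtraction can accumulate rounding error over very long series, so recomputing the cross-products from scratch every so often may be worthwhile.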

strucchange does recursive residuals, which is closely related, but it
is not designed for such flexible movable windows, nor primarily
designed to produce standard errors of coefficients.

before I get started on this, I just wanted to inquire whether someone
has already written such a function.

regards,

/iaw

Ivo Welch (ivo.we...@gmail.com)



[R] MLE for probit regression. How to avoid p=1 or p=0

2013-05-27 Thread knouri


Dear all: 

I am writing the following small function for a probit likelihood.
As indicated, in order to avoid p=1 or p=0, I defined some precisions.
I feel however, that there might be a better way to do this.
Any help is greatly appreciated.

##

##set limits to avoid px=0 or px=1
precision1   <- 0.99
precision0   <- 0.01

logpost <- function(par, data){
px    <- pnorm(b0 + b1x)
# to avoid px=1 or px=0
px[px > precision1] <- precision1
px[px < precision0] <- precision0
loga  <- sum( y*log(px)+(1-y)*log(1-px) )
loga
}

#



Best,
 


Keramat Nourijelyani, PhD
Associate Professor of Biostatistics
Tehran University of Medical Sciences
http://tums.ac.ir/faculties/nourij



Re: [R] MLE for probit regression. How to avoid p=1 or p=0

2013-05-27 Thread Rui Barradas

Hello,

You write a function of two arguments, 'par' and 'data' and do not use 
them in the body of the function. Furthermore, what are b0, b1x and y?


Also, take a look at ?.Machine. In particular, couldn't you use

precision0 <- .Machine$double.eps
precision1 <- 1 - .Machine$double.eps

instead of 0.01 and 0.99?
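
Another option is to avoid clamping altogether by working on the log scale. The sketch below assumes (this is not in the original post) that par is c(b0, b1) and that data is a data frame with a covariate x and a 0/1 response y; probit_loglik is a hypothetical name. It relies on the log.p and lower.tail arguments of pnorm().

```r
# Probit log-likelihood evaluated on the log scale, so no clamping is needed.
# Assumed layout: par = c(b0, b1); data has columns x and y (0/1).
probit_loglik <- function(par, data) {
  eta <- par[1] + par[2] * data$x
  log_p  <- pnorm(eta, log.p = TRUE)                      # log P(y = 1)
  log_1p <- pnorm(eta, lower.tail = FALSE, log.p = TRUE)  # log P(y = 0)
  sum(data$y * log_p + (1 - data$y) * log_1p)
}

d <- data.frame(x = c(-40, -1, 0, 1, 40), y = c(0, 0, 1, 1, 1))
probit_loglik(c(0, 1), d)   # stays finite although pnorm(40) rounds to 1
```

With the naive formula the last observation would give 0 * log(0), which is NaN; here both log terms remain finite.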

Hope this helps,

Rui Barradas

Em 27-05-2013 16:21, knouri escreveu:



Dear all:

I am writing the following small function for a probit likelihood.
As indicated, in order to avoid p=1 or p=0, I defined some precisions.
I feel however, that there might be a better way to do this.
Any help is greatly appreciated.

##

##set limits to avoid px=0 or px=1
precision1   <- 0.99
precision0   <- 0.01

logpost <- function(par, data){
px <- pnorm(b0 + b1x)
# to avoid px=1 or px=0
px[px > precision1] <- precision1
px[px < precision0] <- precision0
loga  <- sum( y*log(px)+(1-y)*log(1-px) )
loga
}

#



Best,



Keramat Nourijelyani, PhD
Associate Professor of Biostatistics
Tehran University of Medical Sciences
http://tums.ac.ir/faculties/nourij



[R] Assistant

2013-05-27 Thread Adelabu Ahmmed
Dear Sir/Ma,

I am Adelabu A. A., an R user from Nigeria. I have a data set of claims paid 
and premiums for individual life-insurance policy holders, but it is not in 
triangle form. How can I run a stochastic chain ladder in R on it?

Please help.
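
One possible starting point, sketched in base R under assumptions the post does not confirm: the data are taken to be in long format with hypothetical columns origin (policy or accident year), dev (development year) and paid. The sketch only builds the run-off triangle; the stochastic fit itself would presumably go to the ChainLadder package (functions such as as.triangle(), MackChainLadder() and BootChainLadder()), which is assumed to be installed and is not called here.

```r
# Hypothetical long-format claims data: one row per payment.
claims <- data.frame(
  origin = c(2010, 2010, 2010, 2011, 2011, 2012),
  dev    = c(1, 2, 3, 1, 2, 1),
  paid   = c(100, 50, 25, 120, 60, 130)
)

# Incremental triangle: total paid per origin year and development year.
# Cells with no data (the future) come out as NA.
inc <- tapply(claims$paid, list(origin = claims$origin, dev = claims$dev), sum)

# Cumulative triangle: running total along each origin row.
cum <- t(apply(inc, 1, cumsum))
cum
```

The cumulative matrix has origin years in rows and development years in columns, which is the layout chain-ladder methods expect.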


[R] How sum all possible combinations of rows, given 4 matrices

2013-05-27 Thread Estigarribia, Bruno
Hello all,

I have 4 matrices with 3 columns each (different number of rows though). I
want to find a function that returns all possible 3-place vectors
corresponding to the sum by columns of picking one row from matrix 1, one
from matrix 2, one from matrix 3, and one from matrix 4. So basically, all
possible ways of picking one row from each matrix and then sum their
columns to obtain a 3-place vector.
Is there a way to use expand.grid and reduce to obtain this result? Or am
I on the wrong track?
Thank you,
Bruno
PS:I believe I have given all relevant info. I apologize in advance if my
question is ill-posed or ambiguous.



[R] Stop on fail using data manipulation

2013-05-27 Thread Ala' Jaouni
Hello,

I have a data set with test results for multiple devices (rows). I also
have an index (column) that stores the first failing test for each device.
I need to remove the results for all the tests that come after the first
failing test.

Example of a data table:

Device,first_failing_test,test1,test2,test3,test4,test5
1,test2,1,2,3,4,5
2,test4,2,3,4,5,6
3,test1,3,4,5,6,7

New table:

Device,first_failing_test,test1,test2,test3,test4,test5
1,test2,1,2,na,na,na
2,test4,2,3,4,5,na
3,test1,3,4,5,na,na

Ideally I need to pass the first table as an argument to a function and get
back the second table.

Any idea how this can be done in R?

Thanks



Re: [R] R-help Digest, Vol 123, Issue 30

2013-05-27 Thread Jim Lemon

On 05/27/2013 10:28 AM, Neotropical bat risk assessments wrote:

Hi all are there any R packages that include circular stats similar to
Oriana (http://www.kovcomp.co.uk/oriana/newver4.html)?

I am interested in looking at annual patterns of bat activity where data
will have date/times and relative abundance values for each Date.

I would like to have a circular plot with the circumference axis the
12 months of the year and then a value of relative abundance and likely
with ggplot2 this can be set to color= species.

Tnx

Bruce


Hi Bruce,
Here is a possibility:

library(plotrix)
batact<-matrix(c(sin(seq(0,1.833*pi,length=12))+2+rnorm(36)/4),
 nrow=3,byrow=TRUE)
batpos<-seq(0,1.833*pi,length=12)
radial.plot(batact,batpos,rp.type="ps",main="Bat activity by month",
 line.col=2:4,radial.lim=0:4,label.pos=batpos,labels=month.abb,
 point.symbols=16:18,point.col=2:4,label.prop=1.1,start=pi/2,
 clockwise=TRUE)
legend(-3.5,0.5,paste("Species",1:3),pch=16:18,col=2:4)

Jim



Re: [R] updating observations in lm

2013-05-27 Thread Bert Gunter
Ivo:

1. You should not be fitting linear models as you describe. For why
not and  how they should be fit, consult a suitable text on numerical
methods (e.g. Givens and Hoeting).

2. In R, I suggest using lm() and ?update, feeding update() data
modified as you like. This is, after all, the reason for update().

-- Bert

On Mon, May 27, 2013 at 8:12 AM, ivo welch ivo.we...@anderson.ucla.edu wrote:
 dear R experts---I would like to update OLS regressions with new
 observations on the front of the data, and delete some old
 observations from the rear.  my goal is to have a flexible
 moving-window regression, with a minimum number of observations and a
 maximum number of observations.  I can keep (X' X) and (X' y), and add
 or subtract observations from these two quantities myself, and then
 use crossprod.

 strucchange does recursive residuals, which is closely related, but it
 is not designed for such flexible movable windows, nor primarily
 designed to produce standard errors of coefficients.

 before I get started on this, I just wanted to inquire whether someone
 has already written such a function.

 regards,

 /iaw
 
 Ivo Welch (ivo.we...@gmail.com)




-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm



Re: [R] How sum all possible combinations of rows, given 4 matrices

2013-05-27 Thread Bert Gunter
Homework? We don't do homework here.

-- Bert

On Mon, May 27, 2013 at 8:24 AM, Estigarribia, Bruno
estig...@email.unc.edu wrote:
 Hello all,

 I have 4 matrices with 3 columns each (different number of rows though). I
 want to find a function that returns all possible 3-place vectors
 corresponding to the sum by columns of picking one row from matrix 1, one
 from matrix 2, one from matrix 3, and one from matrix 4. So basically, all
 possible ways of picking one row from each matrix and then sum their
 columns to obtain a 3-place vector.
 Is there a way to use expand.grid and reduce to obtain this result? Or am
 I on the wrong track?
 Thank you,
 Bruno
 PS:I believe I have given all relevant info. I apologize in advance if my
 question is ill-posed or ambiguous.




-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm



Re: [R] How sum all possible combinations of rows, given 4 matrices

2013-05-27 Thread Jeff Newmiller
I expect the answer to involve manipulating indices. But why do you need to do 
this? This looks suspiciously like homework, and there is a no-homework policy 
on this list (see the Posting Guide).
---
Jeff NewmillerThe .   .  Go Live...
DCN:jdnew...@dcn.davis.ca.usBasics: ##.#.   ##.#.  Live Go...
  Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
/Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
--- 
Sent from my phone. Please excuse my brevity.

Estigarribia, Bruno estig...@email.unc.edu wrote:

Hello all,

I have 4 matrices with 3 columns each (different number of rows though). I
want to find a function that returns all possible 3-place vectors
corresponding to the sum by columns of picking one row from matrix 1, one
from matrix 2, one from matrix 3, and one from matrix 4. So basically, all
possible ways of picking one row from each matrix and then sum their
columns to obtain a 3-place vector.
Is there a way to use expand.grid and reduce to obtain this result? Or am
I on the wrong track?
Thank you,
Bruno
PS:I believe I have given all relevant info. I apologize in advance if my
question is ill-posed or ambiguous.



Re: [R] updating observations in lm

2013-05-27 Thread ivo welch
hi bert---thanks for the answer.

my particular problem is well conditioned [stock returns] and speed is
very important.

about 4 years ago, I asked for speedier alternatives to lm (and you
helped me on this one, too),  and then checked into the speed/accuracy
tradeoff.  http://r.789695.n4.nabble.com/very-fast-OLS-regression-td884832.html
. for the particular problem I had, solve(crossprod(x),crossprod(x,y))
worked reasonably well.  moreover, it is easy to debug, being so
simple.   it was faster than lm() by a factor of 5.  (for a more generic
library use, it would be nice to have a warning flag when this
algorithm fails, in which case it would fall back on a more robust
algorithm or at least emit a warning.  I wonder how much it would cost
to check the condition of the matrix before deciding on the
algorithm.)
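
One way such a check could look, as a sketch only: fast_ols and tol are hypothetical names, rcond() is base R's estimate of the reciprocal condition number, and the fallback goes through lm.fit(). The check works on the k x k cross-product matrix, so for many observations and few regressors its cost should be small next to crossprod(x) itself.

```r
# Normal-equations OLS with a conditioning check and a QR fallback.
# 'fast_ols' and the threshold 'tol' are illustrative choices, not a standard.
fast_ols <- function(x, y, tol = 1e-10) {
  xtx <- crossprod(x)
  if (rcond(xtx) < tol) {
    warning("X'X is ill-conditioned; falling back to lm.fit()")
    return(coef(lm.fit(x, y)))
  }
  drop(solve(xtx, crossprod(x, y)))
}

set.seed(2)
x <- cbind(1, rnorm(100), rnorm(100))
y <- rnorm(100)
fast_ols(x, y)   # same coefficients as lm.fit(x, y), via the fast path
```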

I looked at update(), but its documentation seems to refer to updating
models, not observations.  even if it did, given the speed of lm(), I
don't think it will be that useful.

regards,

/iaw


Ivo Welch (ivo.we...@gmail.com)

On Mon, May 27, 2013 at 9:26 AM, Bert Gunter gunter.ber...@gene.com wrote:
 Ivo:

 1. You should not be fitting linear models as you describe. For why
 not and  how they should be fit, consult a suitable text on numerical
 methods (e.g. Givens and Hoeting).

 2. In R, I suggest using lm() and ?update, feeding update() data
 modified as you like. This is, after all, the reason for update().

 -- Bert

 On Mon, May 27, 2013 at 8:12 AM, ivo welch ivo.we...@anderson.ucla.edu 
 wrote:
 dear R experts---I would like to update OLS regressions with new
 observations on the front of the data, and delete some old
 observations from the rear.  my goal is to have a flexible
 moving-window regression, with a minimum number of observations and a
 maximum number of observations.  I can keep (X' X) and (X' y), and add
 or subtract observations from these two quantities myself, and then
 use crossprod.

 strucchange does recursive residuals, which is closely related, but it
 is not designed for such flexible movable windows, nor primarily
 designed to produce standard errors of coefficients.

 before I get started on this, I just wanted to inquire whether someone
 has already written such a function.

 regards,

 /iaw
 
 Ivo Welch (ivo.we...@gmail.com)




 --

 Bert Gunter
 Genentech Nonclinical Biostatistics

 Internal Contact Info:
 Phone: 467-7374
 Website:
 http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm



Re: [R] Stop on fail using data manipulation

2013-05-27 Thread arun
I have a doubt about your new table, especially the 3rd row:
since the device fails at test1, I guess 4 and 5 should also be NA.
dat1<-read.table(text="
Device,first_failing_test,test1,test2,test3,test4,test5
1,test2,1,2,3,4,5
2,test4,2,3,4,5,6
3,test1,3,4,5,6,7
",sep=",",header=TRUE,stringsAsFactors=FALSE)

res<-do.call(rbind,lapply(seq_len(nrow(dat1)),function(i) 
{indx<-colnames(dat1[i,])[-c(1:2)]%in% dat1[i,2]; indx1<- 
indx[-length(indx)];dat1[i,-c(1:2)][as.logical(cumsum(c(FALSE,indx1)))]<-NA; 
dat1[i,] }))
res
#  Device first_failing_test test1 test2 test3 test4 test5
#1  1  test2 1 2    NA    NA    NA
#2  2  test4 2 3 4 5    NA
#3  3  test1 3    NA    NA    NA    NA
A.K.
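
The same idea wrapped as a function that takes the first table and returns the second, as requested. This is a sketch: stop_on_fail and its fail_col argument are hypothetical names, and it assumes the test columns start at column 3 and appear in the order the tests were run.

```r
# Set every result after a device's first failing test to NA.
# Assumes columns 1-2 are the device id and the failing-test name,
# and that the remaining columns are the tests in run order.
stop_on_fail <- function(dat, fail_col = "first_failing_test") {
  tests <- names(dat)[-(1:2)]
  fail_pos <- match(dat[[fail_col]], tests)        # position of the failure
  after <- outer(fail_pos, seq_along(tests), "<")  # TRUE for later tests
  after[is.na(after)] <- FALSE                     # unknown failure: keep all
  dat[tests][after] <- NA
  dat
}

dat <- data.frame(
  Device = 1:3,
  first_failing_test = c("test2", "test4", "test1"),
  test1 = c(1, 2, 3), test2 = c(2, 3, 4), test3 = c(3, 4, 5),
  test4 = c(4, 5, 6), test5 = c(5, 6, 7),
  stringsAsFactors = FALSE
)
res <- stop_on_fail(dat)
res
```

Because the mask is built for all rows at once, there is no per-row loop or rbind, which may matter for large tables.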

- Original Message -
From: Ala' Jaouni ajao...@gmail.com
To: r-help@r-project.org
Cc: 
Sent: Monday, May 27, 2013 10:40 AM
Subject: [R] Stop on fail using data manipulation

Hello,

I have a data set with test results for multiple devices (rows). I also
have an index (column) that stores the first failing test for each device.
I need to remove the results for all the tests that come after the first
failing test.

Example of a data table:

Device,first_failing_test,test1,test2,test3,test4,test5
1,test2,1,2,3,4,5
2,test4,2,3,4,5,6
3,test1,3,4,5,6,7

New table:

Device,first_failing_test,test1,test2,test3,test4,test5
1,test2,1,2,na,na,na
2,test4,2,3,4,5,na
3,test1,3,4,5,na,na

Ideally I need to pass the first table as an argument to a function and get
back the second table.

Any idea how this can be done in R?

Thanks



Re: [R] updating observations in lm

2013-05-27 Thread Bert Gunter
?lm.fit   ## may be useful to you then. Have you tried it?

-- Bert

On Mon, May 27, 2013 at 9:52 AM, ivo welch ivo.we...@gmail.com wrote:
 hi bert---thanks for the answer.

 my particular problem is well conditioned [stock returns] and speed is
 very important.

 about 4 years ago, I asked for speedier alternatives to lm (and you
 helped me on this one, too),  and then checked into the speed/accuracy
 tradeoff.  
 http://r.789695.n4.nabble.com/very-fast-OLS-regression-td884832.html
 . for the particular problem I had, solve(crossprod(x),crossprod(x,y))
 worked reasonably well.  moreover, it is easy to debug, being so
 simple.   it was faster than lm() by a factor of 5.  (for a more generic
 library use, it would be nice to have a warning flag when this
 algorithm fails, in which case it would fall back on a more robust
 algorithm or at least emit a warning.  I wonder how much it would cost
 to check the condition of the matrix before deciding on the
 algorithm.)

 I looked at update(), but its documentation seems to refer to updating
 models, not observations.  even if it did, given the speed of lm(), I
 don't think it will be that useful.

 regards,

 /iaw

 
 Ivo Welch (ivo.we...@gmail.com)

 On Mon, May 27, 2013 at 9:26 AM, Bert Gunter gunter.ber...@gene.com wrote:
 Ivo:

 1. You should not be fitting linear models as you describe. For why
 not and  how they should be fit, consult a suitable text on numerical
 methods (e.g. Givens and Hoeting).

 2. In R, I suggest using lm() and ?update, feeding update() data
 modified as you like. This is, after all, the reason for update().

 -- Bert

 On Mon, May 27, 2013 at 8:12 AM, ivo welch ivo.we...@anderson.ucla.edu 
 wrote:
 dear R experts---I would like to update OLS regressions with new
 observations on the front of the data, and delete some old
 observations from the rear.  my goal is to have a flexible
 moving-window regression, with a minimum number of observations and a
 maximum number of observations.  I can keep (X' X) and (X' y), and add
 or subtract observations from these two quantities myself, and then
 use crossprod.

 strucchange does recursive residuals, which is closely related, but it
 is not designed for such flexible movable windows, nor primarily
 designed to produce standard errors of coefficients.

 before I get started on this, I just wanted to inquire whether someone
 has already written such a function.

 regards,

 /iaw
 
 Ivo Welch (ivo.we...@gmail.com)




 --

 Bert Gunter
 Genentech Nonclinical Biostatistics

 Internal Contact Info:
 Phone: 467-7374
 Website:
 http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm



-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm



Re: [R] Question about subsetting S4 object in ROCR

2013-05-27 Thread Uwe Ligges



On 27.05.2013 16:18, Guido Leoni wrote:

Dear list
I'm testing a predictor and I produced nice performance plots with the ROCR
package using the 3 standard commands

pred <- prediction(predictions, labels)
perf <- performance(pred, measure = "tpr", x.measure = "fpr")
plot(perf, col=rainbow(10))

The pred object and the perf object are S4
with the following slots

An object of class "performance"
Slot "x.name":
[1] "False positive rate"

Slot "y.name":
[1] "True positive rate"

Slot "alpha.name":
[1] "Cutoff"

Slot "x.values":
[[1]]
  [1] 0.00 0.00 0.05 0.10 0.10 0.10 0.10 0.10 0.15 0.15 0.15 0.20 0.25 0.25
0.25 0.25 0.25 0.30 0.35 0.35 0.35 0.40 0.40 0.45 0.50 0.50 0.55 0.55 0.60
[30] 0.65 0.65 0.70 0.70 0.75 0.80 0.85 0.90 0.90 0.95 1.00 1.00


Slot "y.values":
[[1]]
  [1] 0.00 0.05 0.05 0.05 0.10 0.15 0.20 0.25 0.25 0.30 0.35 0.35 0.35 0.40
0.45 0.50 0.55 0.55 0.55 0.60 0.65 0.65 0.70 0.70 0.70 0.75 0.75 0.80 0.80
[30] 0.80 0.85 0.85 0.90 0.90 0.90 0.90 0.90 0.95 0.95 0.95 1.00


Slot "alpha.values":
[[1]]
  [1]   Inf 33309 32968 31688 31648 31355 31122 31047 30777 30589 30460
30395 30305 30159 29841 29101 28734 28657 28393 28196 27740 27662 27373
27078
[25] 26763 26303 25573 25416 25364 25357 24993 23834 23789 23616 22357
20669 20092 18720 18136 17323 16665


Now I'd like to make a plot (and also compute the AUC) only of the area
corresponding to  0.80  y.values and 0.40  x.values.
In your experience, is it possible to subset the perf object to
the aforementioned values?


But x=0.4 and y=0.8 is just a point, so I don't get which plot and area 
you are talking about now?
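
If the intended region is the part of the ROC curve with false positive rate up to 0.40 (an assumption, since the comparison signs are missing from the post), a partial AUC can be sketched with the trapezoid rule on the slot values. partial_auc is a hypothetical helper and the toy vectors are made up; with a real object one would pass perf@x.values[[1]] and perf@y.values[[1]]. As far as I recall, ROCR can also do this itself via performance(pred, "auc", fpr.stop = 0.4), but that call is not exercised here.

```r
# Trapezoid-rule area under a curve, restricted to x <= x_max.
# No interpolation is done at x_max: only points on the curve are used.
partial_auc <- function(x, y, x_max) {
  keep <- x <= x_max
  x <- x[keep]
  y <- y[keep]
  sum(diff(x) * (head(y, -1) + tail(y, -1)) / 2)
}

# Toy ROC curve (made-up values, not the data from the post)
fpr <- c(0, 0, 0.5, 1)
tpr <- c(0, 0.5, 0.5, 1)
partial_auc(fpr, tpr, 0.5)   # area up to FPR = 0.5
partial_auc(fpr, tpr, 1)     # full area under this toy curve

# The same subset can be plotted directly:
keep <- fpr <= 0.5
plot(fpr[keep], tpr[keep], type = "s",
     xlab = "False positive rate", ylab = "True positive rate")
```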


Best,
Uwe Ligges







  Thanks
Guido



Re: [R] How sum all possible combinations of rows, given 4 matrices

2013-05-27 Thread arun
Hi,
Not sure if this is what you expected:

set.seed(24)
mat1<- matrix(sample(1:20,3*4,replace=TRUE),ncol=3)
set.seed(28)
mat2<- matrix(sample(1:25,3*6,replace=TRUE),ncol=3)
set.seed(30)
mat3<- matrix(sample(1:35,3*8,replace=TRUE),ncol=3)
set.seed(35)
mat4<- matrix(sample(1:40,3*10,replace=TRUE),ncol=3)

dat1<-expand.grid(seq(dim(mat1)[1]),seq(dim(mat2)[1]),seq(dim(mat3)[1]),seq(dim(mat4)[1]))
vec1<-paste0("mat",1:4)
matNew<-do.call(cbind,lapply(seq_len(ncol(dat1)),function(i) 
get(vec1[i])[dat1[,i],]))
colnames(matNew)<- (seq(12)-1)%%3+1
datNew<-data.frame(matNew)
res<-sapply(split(colnames(datNew),gsub("\\..*","",colnames(datNew))),function(x)
 rowSums(datNew[,x]))

dim(res)
#[1] 1920    3
 head(res)
# X1 X2 X3
#[1,] 46 63 70
#[2,] 45 68 59
#[3,] 55 55 66
#[4,] 51 65 61
#[5,] 48 84 75
#[6,] 47 89 64

A.K.
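
A shorter variant that uses expand.grid() and Reduce() directly, as the question suggested. This is a sketch: sum_row_combos is a hypothetical name, and it assumes all matrices have the same number of columns.

```r
# All column-wise sums obtained by picking one row from each matrix.
# Returns one row per combination; 'sum_row_combos' is an illustrative name.
sum_row_combos <- function(...) {
  mats <- list(...)
  # every combination of row indices, one column of indices per matrix
  idx <- expand.grid(lapply(mats, function(m) seq_len(nrow(m))))
  # expand each matrix to the chosen rows, then add the expanded matrices
  Reduce(`+`, Map(function(m, i) m[i, , drop = FALSE], mats, idx))
}

m1 <- matrix(1:6, ncol = 3)            # rows: (1, 3, 5) and (2, 4, 6)
m2 <- matrix(c(10, 20, 30), ncol = 3)  # single row
combos <- sum_row_combos(m1, m1, m1, m2)
dim(combos)   # 2 * 2 * 2 * 1 combinations, 3 columns
```

Rows of the result follow the order of expand.grid(), with the first matrix's row index varying fastest.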

- Original Message -
From: Estigarribia, Bruno estig...@email.unc.edu
To: r-help@R-project.org r-help@r-project.org
Cc: 
Sent: Monday, May 27, 2013 11:24 AM
Subject: [R] How sum all possible combinations of rows, given 4 matrices

Hello all,

I have 4 matrices with 3 columns each (different number of rows though). I
want to find a function that returns all possible 3-place vectors
corresponding to the sum by columns of picking one row from matrix 1, one
from matrix 2, one from matrix 3, and one from matrix 4. So basically, all
possible ways of picking one row from each matrix and then sum their
columns to obtain a 3-place vector.
Is there a way to use expand.grid and reduce to obtain this result? Or am
I on the wrong track?
Thank you,
Bruno
PS:I believe I have given all relevant info. I apologize in advance if my
question is ill-posed or ambiguous.



[R] How I can rearrange columns in data.frame?

2013-05-27 Thread Kristi Glover
Hi R-User,
I am wondering how I can rearrange columns in a table in R. I have a very big 
data set (4500 columns). I have given an example of the data set.

dput(dat)
structure(list(preV1001A1b = c(0.59, 0.3, 0.78, 0.43), preV15A1b = c(0.57, 
0.62, 0.51, 0.95), preV2032A1b = c(0.4, 0.8, 0.24, 0.34), preV2035A1b = c(0.95, 
0.67, 0.81, 0.8), preV59A1b = c(0.05, 0.57, 0.03, 0.5)), .Names = c("preV1001A1b", 
"preV15A1b", "preV2032A1b", "preV2035A1b", "preV59A1b"), class = "data.frame", 
row.names = c(NA, -4L))

I want to make it look like this:
dput(dat1)
structure(list(preV15A1b = c(0.57, 0.62, 0.51, 0.95), preV59A1b = c(0.05, 
0.57, 0.03, 0.5), preV1001A1b = c(0.59, 0.3, 0.78, 0.43), preV2032A1b = c(0.4, 
0.8, 0.24, 0.34), preV2035A1b = c(0.95, 0.67, 0.81, 0.8)), .Names = c("preV15A1b", 
"preV59A1b", "preV1001A1b", "preV2032A1b", "preV2035A1b"), class = "data.frame", 
row.names = c(NA, -4L))
Any suggestions?
KG


Re: [R] How I can rearrange columns in data.frame?

2013-05-27 Thread Berend Hasselman

On 27-05-2013, at 20:17, Kristi Glover kristi.glo...@hotmail.com wrote:

 Hi R-User,
 I am wondering how I can rearrange columns in a table in R. I have a very 
 big data set (4500 columns). I have given an example of the data set.
 
 dput(dat)
 structure(list(preV1001A1b = c(0.59, 0.3, 0.78, 0.43), preV15A1b = c(0.57, 
 0.62, 0.51, 0.95), preV2032A1b = c(0.4, 0.8, 0.24, 0.34), preV2035A1b = c(0.95, 
 0.67, 0.81, 0.8), preV59A1b = c(0.05, 0.57, 0.03, 0.5)), .Names = c("preV1001A1b", 
 "preV15A1b", "preV2032A1b", "preV2035A1b", "preV59A1b"), class = "data.frame", 
 row.names = c(NA, -4L))
 
 I want to make it look like this:
 dput(dat1)
 structure(list(preV15A1b = c(0.57, 0.62, 0.51, 0.95), preV59A1b = c(0.05, 
 0.57, 0.03, 0.5), preV1001A1b = c(0.59, 0.3, 0.78, 0.43), preV2032A1b = c(0.4, 
 0.8, 0.24, 0.34), preV2035A1b = c(0.95, 0.67, 0.81, 0.8)), .Names = c("preV15A1b", 
 "preV59A1b", "preV1001A1b", "preV2032A1b", "preV2035A1b"), class = "data.frame", 
 row.names = c(NA, -4L))
 Any suggestions?
 KG


dat2 <- dat[,c("preV15A1b", "preV59A1b", "preV1001A1b", "preV2032A1b", 
"preV2035A1b")]
identical(dat1,dat2)   

or something like this:

dat3.cols <- match(c("preV15A1b", "preV59A1b", "preV1001A1b", "preV2032A1b", 
"preV2035A1b"), names(dat))
dat3 <- dat[,dat3.cols]
identical(dat3,dat2)

A general solution will depend on the new ordering of your columns.

Berend



Re: [R] How I can rearrange columns in data.frame?

2013-05-27 Thread peter dalgaard

On May 27, 2013, at 20:17 , Kristi Glover wrote:

 Hi R-User,
 I am wondering how I can rearrange columns in a table in R. I have a very 
 big data set (4500 columns). I have given an example of the data set.
 
 dput(dat)
 structure(list(preV1001A1b = c(0.59, 0.3, 0.78, 0.43), preV15A1b = c(0.57, 
 0.62, 0.51, 0.95), preV2032A1b = c(0.4, 0.8, 0.24, 0.34), preV2035A1b = c(0.95, 
 0.67, 0.81, 0.8), preV59A1b = c(0.05, 0.57, 0.03, 0.5)), .Names = c("preV1001A1b", 
 "preV15A1b", "preV2032A1b", "preV2035A1b", "preV59A1b"), class = "data.frame", 
 row.names = c(NA, -4L))
 
 I want to make it look like this:
 dput(dat1)
 structure(list(preV15A1b = c(0.57, 0.62, 0.51, 0.95), preV59A1b = c(0.05, 
 0.57, 0.03, 0.5), preV1001A1b = c(0.59, 0.3, 0.78, 0.43), preV2032A1b = c(0.4, 
 0.8, 0.24, 0.34), preV2035A1b = c(0.95, 0.67, 0.81, 0.8)), .Names = c("preV15A1b", 
 "preV59A1b", "preV1001A1b", "preV2032A1b", "preV2035A1b"), class = "data.frame", 
 row.names = c(NA, -4L))
 Any suggestions?
 KG

Is there a particular logic to that ordering? Otherwise, the obvious way is

nm <- c("preV15A1b", "preV59A1b", "preV1001A1b", "preV2032A1b", "preV2035A1b")
dat1 <- dat[nm]

Or, maybe you are looking for something like this?

o <- order(as.numeric(sub("preV([0-9]*)A1b", "\\1", names(dat))))
(dat1 <- dat[o])

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com



Re: [R] How I can rearrange columns in data.frame?

2013-05-27 Thread arun
Hi,
Try this:
dat2<-dat[order(as.numeric(gsub("preV(\\d+).*","\\1",colnames(dat))))]
dat2
#  preV15A1b preV59A1b preV1001A1b preV2032A1b preV2035A1b
#1      0.57      0.05        0.59        0.40        0.95
#2      0.62      0.57        0.30        0.80        0.67
#3      0.51      0.03        0.78        0.24        0.81
#4      0.95      0.50        0.43        0.34        0.80


identical(dat1,dat2)
#[1] TRUE
A.K.




Re: [R] updating observations in lm

2013-05-27 Thread Greg Snow
Look at the biglm package.  It does 2 of the 3 things that you asked for:
 Construct an initial lm fit and add a new block of data to update that
fit.  It does not remove data, but you may be able to look at the code and
figure out a way to modify it to do the final piece.


On Mon, May 27, 2013 at 9:12 AM, ivo welch ivo.we...@anderson.ucla.edu wrote:

 dear R experts---I would like to update OLS regressions with new
 observations on the front of the data, and delete some old
 observations from the rear.  my goal is to have a flexible
 moving-window regression, with a minimum number of observations and a
 maximum number of observations.  I can keep (X' X) and (X' y), and add
 or subtract observations from these two quantities myself, and then
 use crossprod.

 strucchange does recursive residuals, which is closely related, but it
 is not designed for such flexible movable windows, nor primarily
 designed to produce standard errors of coefficients.

 before I get started on this, I just wanted to inquire whether someone
 has already written such a function.

 regards,

 /iaw
 
 Ivo Welch (ivo.we...@gmail.com)





-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com



Re: [R] updating observations in lm

2013-05-27 Thread Roger Koenker
The essential trick here is the Sherman-Morrison-Woodbury formula.  
My quantreg package has a lm.fit.recursive function that implements
a fortran version for adding observations, but like biglm I don't remove
observations at the other end either.
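
A minimal base-R sketch of the Sherman-Morrison idea, covering both adding and removing a single observation. This is an illustration only, not the Fortran routine behind lm.fit.recursive; the function name sm_update and the simulated data are assumptions made for the example.

```r
# Sherman-Morrison rank-one update of Ainv = (X'X)^{-1}:
# sign = +1 adds the row x, sign = -1 removes it.
sm_update <- function(Ainv, x, sign = 1) {
  x <- as.numeric(x)
  Ax <- Ainv %*% x                              # p x 1 column
  denom <- 1 + sign * drop(crossprod(x, Ax))    # 1 + sign * x' Ainv x
  Ainv - sign * (Ax %*% t(Ax)) / denom
}

set.seed(1)
X <- cbind(1, matrix(rnorm(40), ncol = 2))      # 20 x 3 design with intercept

Ainv     <- solve(crossprod(X[1:19, ]))         # start from rows 1..19
Ainv_add <- sm_update(Ainv, X[20, ], sign = 1)  # add row 20
Ainv_del <- sm_update(Ainv_add, X[1, ], sign = -1)  # drop row 1
```

The coefficients then follow as Ainv %*% X'y, with X'y updated in the same add/subtract fashion. Removing a row can be numerically delicate when the denominator is close to zero.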

Roger Koenker
rkoen...@illinois.edu






Re: [R] updating observations in lm

2013-05-27 Thread Berend Hasselman

For regression one would use a QR decomposition.
There is an open-source Fortran library qrupdate 
(http://sourceforge.net/projects/qrupdate/) that can update an unpivoted QR 
decomposition for the case of deleting rows/columns and inserting rows/columns.

It could be used to make an R package, which could be used for doing a moving 
window regression.
Quite a lot of work.

Berend



Re: [R] choose the lines

2013-05-27 Thread arun


Hi,
Try this:
dat1<- read.csv("dat7.csv",header=TRUE,stringsAsFactors=FALSE,sep="\t")
dat.bru<- dat1[!is.na(dat1$evnmt_brutal),]

fun1<- function(dat){
    lst1<- split(dat,dat$patient_id)
    # drop each patient's rows that come before the first evnmt_brutal==0
    lst2<- lapply(lst1,function(x) x[cumsum(x$evnmt_brutal==0)>0,])
    # drop patients whose remaining rows are all 1 or all 0
    lst3<- lapply(lst2,function(x) x[!(all(x$evnmt_brutal==1)|all(x$evnmt_brutal==0)),])
    lst4<- lapply(lst3,function(x) {
        vect.brutal=c()
        for(line in which(x$evnmt_brutal==1)){
            if(x$evnmt_brutal[line-1]==0){
                vect.brutal=c(vect.brutal,line)
            }
        }
        # keep each selected line together with the line before it
        vect.brutal1<- sort(c(vect.brutal,vect.brutal-1))
        x[vect.brutal1,]
    })
    res<- do.call(rbind,lst4)
    row.names(res)<- 1:nrow(res)
    res
}


fun1(dat.bru)
head(fun1(dat.bru),10)
#    X patient_id number responsed_at  t basdai_d evnmt_brutal
#1  14          2     13   2011-08-07 13    0.900            0
#2  15          2     14   2011-09-11 14   -0.800            1
#3  22          3      2   2010-06-29  1   -0.800            0
#4  23          3      3   2010-08-05  2    0.000            1
#5  24          3      4   2010-09-05  3    1.200            0
#6  25          3      5   2010-10-13  4    1.925            1
#7  26          3      6   2010-11-15  5   -2.525            0
#8  27          3      7   2010-12-18  6   -0.200            1
#9  53          5      9   2011-02-13  8    0.000            0
#10 54          5     10   2011-03-19  9   -1.200            1
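
A shorter vectorised variant of the same selection, offered as a sketch rather than a replacement. It assumes the rows are already sorted by patient and time, that NA values of evnmt_brutal have been removed, and that each patient's rows are contiguous. The function name pick_transitions and the toy data are made up for the illustration.

```r
pick_transitions <- function(dat) {
  # previous evnmt_brutal within the same patient (NA on each patient's first row)
  prev <- ave(dat$evnmt_brutal, dat$patient_id,
              FUN = function(v) c(NA, head(v, -1)))
  # rows with a 1 that directly follow a 0 of the same patient
  hit <- which(dat$evnmt_brutal == 1 & !is.na(prev) & prev == 0)
  # keep each hit together with the row before it
  dat[sort(unique(c(hit - 1, hit))), ]
}

toy <- data.frame(patient_id   = c(1, 1, 1, 2, 2, 2),
                  evnmt_brutal = c(1, 0, 0, 1, 0, 1))
picked <- pick_transitions(toy)
picked
```

In the toy data, row 4 is a 1 that follows a 0 belonging to a different patient, so it is not selected; only rows 5 and 6 are returned.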


A.K.
___
From: GUANGUAN LUO guanguan...@gmail.com
To: arun smartpink...@yahoo.com 
Sent: Monday, May 27, 2013 8:48 AM
Subject: choose the lines



Hello, Arun,

In this data, I want to choose every line with the variable evnmt_brutal==1 
and the preceding line (line-1) with evnmt_brutal==0. 
I had done this: 

res.bru <- dat7[!is.na(dat7$evnmt_brutal),]
vect.brutal=c()
for(line in which(res.bru$evnmt_brutal==1)){
  if(res.bru$evnmt_brutal[line-1]==0){
    vect.brutal=c(vect.brutal,line)}
}
vect.brutal

But now I think it's not correct, because there can be situations like 
this:
Patient_id  evnmt_brutal
1  ...
1  ...
1  0
2  1
2  ...
2  ...

I would have chosen lines from two different patients, which is not correct.
Do you know how I can change this a little and get the correct lines within each 
patient?

Thank you so much.

GG



Re: [R] updating observations in lm

2013-05-27 Thread ivo welch
Gentleman's AS 274 algorithm allows weights, so adding an obs with a weight
of -1 would do the trick of removing obs, too.
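
The weight trick can be illustrated on the plain cross-products X'X and X'y. This is only a sketch of the idea and not the AS 274 algorithm itself, which works on a QR-type factorisation and is numerically safer than the normal equations. The name xp_update and the simulated data are assumptions made for the example.

```r
# Keep X'X and X'y; a weight of +1 adds an observation, -1 removes it.
xp_update <- function(state, x, y, w = 1) {
  x <- as.numeric(x)
  state$XtX <- state$XtX + w * tcrossprod(x)   # w * x x'
  state$Xty <- state$Xty + w * x * y           # w * x y
  state
}

set.seed(2)
X <- cbind(1, rnorm(30))
y <- 2 + 3 * X[, 2] + rnorm(30)

# initial window: observations 1..20
state <- list(XtX = crossprod(X[1:20, ]),
              Xty = drop(crossprod(X[1:20, ], y[1:20])))
state <- xp_update(state, X[21, ], y[21], w = 1)   # newest observation in
state <- xp_update(state, X[1, ],  y[1],  w = -1)  # oldest observation out
beta  <- solve(state$XtX, state$Xty)               # fit on observations 2..21
```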

This may be a good job for Hadley Wickham's C code interface.

Re: [R] updating observations in lm

2013-05-27 Thread Berend Hasselman

On 27-05-2013, at 21:57, ivo welch ivo.we...@gmail.com wrote:

 
 Gentleman's AS 274 algorithm allows weights, so adding an obs with a weight of 
 -1 would do the trick of removing obs, too.
 
 This may be a good job for Hadley Wickham's C code interface.

Searching for "Gentleman's AS 274 algorithm" with Google turned up this:

http://jblevins.org/mirror/amiller/

where there is Fortran code for an updated AS 274 algorithm.
I can't judge whether this is suitable for deleting observations.

Berend



[R] Bayes Logit and Cholesky Decomposition

2013-05-27 Thread Tjun Kiat Teo
I am trying to use the BayesLogit package and I keep getting this error
message.

 chol2inv(chol(P1.j)) :
  error in evaluating the argument 'x' in selecting a method for function
'chol2inv': Error in chol.default(P1.j) :
  the leading minor of order 5 is not positive definite

I can't see why this would be so because the prior variance matrix that I
feed in is a diagonal matrix so  it is definitely positive definite.
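
One way to narrow this down is to test the matrices directly. The sketch below assumes (without having checked the package source) that P1.j is a matrix built inside the sampler from the prior and the data, so a valid prior alone does not guarantee it is positive definite. The helper is_pd and the collinear design X are hypothetical and only illustrate the check.

```r
# TRUE if the symmetric part of M has all eigenvalues clearly above zero
is_pd <- function(M, tol = 1e-8) {
  ev <- eigen((M + t(M)) / 2, symmetric = TRUE, only.values = TRUE)$values
  all(ev > tol * max(abs(ev)))
}

P0 <- diag(c(1, 0.5, 2, 1, 10))   # diagonal prior with positive entries
X  <- cbind(1, 1:6, 2 * (1:6))    # hypothetical design with collinear columns

pd_prior <- is_pd(P0)             # diagonal with positive entries
pd_cross <- is_pd(crossprod(X))   # rank-deficient cross-product
```

A diagonal matrix with positive entries passes, while a cross-product from collinear columns does not; non-finite values (NA, NaN, Inf) in the data are another common cause of the same chol() error.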

Tjun Kiat



Re: [R] Assistant

2013-05-27 Thread Jim Lemon

On 05/28/2013 12:22 AM, Adelabu Ahmmed wrote:

Dear Sir/Ma,

I am Adelabu A. A., an R user from Nigeria. I have a data set of claims 
paid and premiums for individual life-insurance policy holders, but not in triangle 
form. How can I run a stochastic chain ladder on it in R?

Please help.


Hi Ahmmed,
This is a very specific question. You might find answers by contacting 
an actuarial forum, e.g.


http://www.actuary.com/actuarial-discussion-forum/
http://www.actuarialoutpost.com/actuarial_discussion_forum/

Jim



[R] Data reshaping

2013-05-27 Thread Christofer Bogaso
Hello again, let's say I have the following data frame:

 Dat <- data.frame(c(rep(c("A", "B"), each = 4), "C", "C", "C"),
c(rep(1:4, 2), 1, 2, 3), 11:21)
 colnames(Dat) <- c("X1", "X2", "X3")
 Dat
   X1 X2 X3
1   A  1 11
2   A  2 12
3   A  3 13
4   A  4 14
5   B  1 15
6   B  2 16
7   B  3 17
8   B  4 18
9   C  1 19
10  C  2 20
11  C  3 21


Now I want to put that data-frame in the following form:

 Dat1 <- rbind(c(11,12,13,14), c(15,16,17,18), c(19,20,21, NA));
colnames(Dat1) <- c(1,2,3,4); rownames(Dat1) <- c("A", "B", "C")
 Dat1
   1  2  3  4
A 11 12 13 14
B 15 16 17 18
C 19 20 21 NA


Basically, 'Dat' is the melted form of 'Dat1'

Can somebody point me to an R function for doing that?

Thanks for your help.



Re: [R] Data reshaping

2013-05-27 Thread arun
res1<- xtabs(X3~X1+X2,data=Dat)
res1
#   X2
#X1   1  2  3  4
#  A 11 12 13 14
#  B 15 16 17 18
#  C 19 20 21  0
library(reshape2)
dcast(Dat,X1~X2,value.var="X3")
#  X1  1  2  3  4
#1  A 11 12 13 14
#2  B 15 16 17 18
#3  C 19 20 21 NA
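
Note that xtabs fills the empty (C, 4) cell with 0, while the requested result has NA there. If a plain matrix with NA is wanted without loading a package, tapply is one base-R option. A small sketch, rebuilding the example data so it runs on its own:

```r
Dat <- data.frame(X1 = c(rep(c("A", "B"), each = 4), "C", "C", "C"),
                  X2 = c(rep(1:4, 2), 1, 2, 3),
                  X3 = 11:21)

# one value per (X1, X2) cell; cells with no data stay NA
wide <- tapply(Dat$X3, Dat[c("X1", "X2")], sum)
wide
```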
A.K.




Re: [R] Plot histograms in a loop

2013-05-27 Thread arun
Hi,

Try either:
set.seed(28)
stats1<- as.data.frame(matrix(rnorm(5*1),ncol=5))

pdf(paste("test",1,".pdf",sep=""))
par(mfrow=c(2,1))
lst1<- lapply(names(stats1),function(i) {
hist(stats1[,i],100,col="lightblue",main=paste0("Histogram of ",i),xlab=i)
qqnorm(stats1[,i])})
dev.off()

#or

pdf(paste("test1",1,".pdf",sep=""))
par(mfrow=c(2,1))
for(colName in names(stats1)){
hist(stats1[,colName],100,col="lightblue",xlab=colName,main=paste0("Histogram of ",colName))
qqnorm(stats1[,colName])
}
dev.off()

A.K.


I have a dataset with more than 50 columns, and I need to check 
distribution for each variable. The idea was to plot histograms and qq 
plots for each of them and check if distribution is normal. I tried 
something like this: 

for(colName in names(stats)){ 
    pdf(paste("test",1,".pdf",sep="")) 
    hist(stats$get(colName), 100, col="lightblue") 
    qqnorm(stats$get(colName))     
} 
dev.off() 

but that doesn't work. It would be great if I could also manage 
to store all of them in one file, what I think this code should do... 

Thanks,



[R] fitting grid-based models

2013-05-27 Thread Javier Rodríguez Pérez
Hello!

I'm interested in fitting parameters (to data) in a grid-based (individual)
model. If I understood correctly, the simecol library has the fitOdeModel
function, but it is only suited to odeModels (differential equations).
Alternatively, the FME package has several functions able to perform this
procedure, but all examples are for differential equation models. It is
mentioned in the documentation that such functions could also be used for
other kinds of models. Could they thus be used for grid-based ones? If so,
could they be used even if the model does not follow the simecol syntax?

Thanks for your help!
Javier

-- 
##
Javier Rodríguez Pérez

Dep. Biología de Organismos y Sistemas
Unidad Mixta de Investigación en Biodiversidad
Universidad de Oviedo
Valentin Andrés Álvarez s/n, Oviedo 33006, Spain



Re: [R] Data reshaping

2013-05-27 Thread Duncan Mackay

library(reshape2)

dcast(Dat, X1 ~ X2, value.var = "X3")
  X1  1  2  3  4
1  A 11 12 13 14
2  B 15 16 17 18
3  C 19 20 21 NA

or use ?reshape
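
For the base-R route, a sketch of what the reshape() call might look like on the example data (the data frame is rebuilt here so the snippet runs on its own). The wide columns get reshape's default names X3.1 to X3.4 rather than 1 to 4.

```r
Dat <- data.frame(X1 = c(rep(c("A", "B"), each = 4), "C", "C", "C"),
                  X2 = c(rep(1:4, 2), 1, 2, 3),
                  X3 = 11:21)

# X1 identifies the rows of the wide table, X2 supplies the new columns
wide <- reshape(Dat, idvar = "X1", timevar = "X2", direction = "wide")
wide
```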

HTH

Duncan

Duncan Mackay
Department of Agronomy and Soil Science
University of New England
Armidale NSW 2351
Email: home: mac...@northnet.com.au





[R] p values of plor

2013-05-27 Thread meng
Hi all:
For the polr {MASS} function, how can I find the p-values of every parameter?


From the example in the R help:
house.plr <- polr(Sat ~ Infl + Type + Cont, weights = Freq, data = housing)
summary(house.plr)


How can I find the p-values for house.plr?




Many thanks.
Best.




Re: [R] p values of plor

2013-05-27 Thread David Winsemius



Getting  p-values from t-statistics should be fairly straight-forward:

summary(house.plr)$coefficients
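
Spelled out, one common approach looks like the sketch below. It assumes a normal approximation for the t values is acceptable, which is a choice made for this example and not something polr reports itself. Hess = TRUE is added only so that summary() does not need to refit the model.

```r
library(MASS)

house.plr <- polr(Sat ~ Infl + Type + Cont, weights = Freq,
                  data = housing, Hess = TRUE)

ctab <- coef(summary(house.plr))   # columns: Value, Std. Error, t value
# two-sided p-values from the normal approximation
p <- 2 * pnorm(abs(ctab[, "t value"]), lower.tail = FALSE)
cbind(ctab, "p value" = p)
```

The table includes the intercepts (cut points) as well as the regression coefficients.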

--

David Winsemius, MD
Alameda, CA, USA
