Re: [R] How do I know where is R? - VB Programming
Duncan Murdoch wrote:

On 14/07/2009 7:22 PM, Haoda Fu wrote: Dear all - Does anyone know how to let VB or C# detect automatically where R is installed (i.e. auto-detect the R directory)?

On Windows, if you run the installer it will record its location in the registry, under *\Software\R-core\R\, where * is HKLM or HKCU, depending on what permissions the user who installed it had. But some users don't run the installer (maybe because someone else installed it on a network, or they built it themselves), so that isn't completely reliable.

On Unix/Linux we have

$ R RHOME
/usr/lib/R

but the same thing doesn't work with Rterm on Windows AFAICS. I wonder if it shouldn't?

--
Peter Dalgaard, Dept. of Biostatistics, University of Copenhagen
Øster Farimagsgade 5, Entr. B, PO Box 2099, 1014 Cph. K, Denmark
Ph: (+45) 35327918, FAX: (+45) 35327907 (p.dalga...@biostat.ku.dk)

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Correlation question (from a newbie)
Dear R-users, please ignore my most recent posting. Found the solution, thanks to David Winsemius. Thanks, Santosh

On Tue, Jul 14, 2009 at 9:14 PM, Santosh santosh2...@gmail.com wrote: Dear R-users, I hope the following scenario explains my question better. Continuous variables: AGE, WEIGHT, HEIGHT. Categorical variables: Group, Sex, Race. I would like to find a correlation between WEIGHT and AGE, grouped by Group, Sex, and Race. Is the following formula correct?

tapply(dat$WEIGHT, by=list(dat$AGE,as.factor(dat$Group),as.factor(dat$EX),as.factor(dat$RACE)), cor)

Thanks, Santosh

On Tue, Jul 14, 2009 at 7:34 PM, Santosh santosh2...@gmail.com wrote: Hi R-users, I was wondering if there is a way to quickly compute correlations between continuous variables grouped by some categorical variables? What function do I use? Thanks much in advance. Regards, Santosh
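For archive readers: tapply() has no by argument and operates on a single vector, so it cannot compute a correlation between two columns. A minimal sketch of one way to do it with by(), on simulated data (the data frame and its column names here are invented for illustration):

```r
# Simulated data standing in for the poster's dataset (names are hypothetical)
set.seed(42)
dat <- data.frame(
  AGE    = runif(120, 20, 70),
  WEIGHT = rnorm(120, 75, 10),
  Group  = sample(c("A", "B"), 120, replace = TRUE),
  SEX    = sample(c("M", "F"), 120, replace = TRUE)
)

# One correlation of WEIGHT and AGE per Group x SEX cell
res <- by(dat, list(dat$Group, dat$SEX),
          function(d) cor(d$WEIGHT, d$AGE))
res
```

Adding a third grouping factor (Race, in the original question) is just one more element in the list passed to by().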
Re: [R] (simple) xml into data.frame and reverse
Duncan Temple Lang wrote: I wrote some relatively general, but hastily written, functions to read this sort of data. You can find them attached or at http://www.omegahat.org/RSXML/xmlToDataFrame.xml

Looks like that's the wrong link. I also did not find it mentioned on the parent page. Dieter

-- View this message in context: http://www.nabble.com/%28simple%29-xml-into-data.frame-and-reverse-tp24488291p24492792.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] Grouping data in dataframe
Timo Schneider wrote: I have a dataframe (obtained from read.table()) which looks like

  ExpA ExpB ExpC Size
1   12   23   33    1
2   12   24   29    1
3   10   22   34    1
4   25   50   60    2
5   24   53   62    2
6   21   49   61    2

Now I want to take all rows that have the same value in the Size column and apply a function (for example median()) to the columns of these rows. The result should be a new dataframe with the medians of the groups.

Besides the already mentioned aggregate, you could use one of the functions in package plyr. Dieter
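The aggregate() suggestion can be sketched directly on the data above (re-entered here as a small data frame, with column values as read from the post):

```r
# The example data frame from the post
dat <- data.frame(
  ExpA = c(12, 12, 10, 25, 24, 21),
  ExpB = c(23, 24, 22, 50, 53, 49),
  ExpC = c(33, 29, 34, 60, 62, 61),
  Size = c(1, 1, 1, 2, 2, 2)
)

# One median per Size group for each experiment column
med <- aggregate(dat[c("ExpA", "ExpB", "ExpC")],
                 by = list(Size = dat$Size), FUN = median)
med
```

The result is a data frame with one row per Size value, which is exactly the shape the poster asked for.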
Re: [R] SOS! error in GLM logistic regression...
On Tue, 2009-07-14 at 15:57 -0700, Michael wrote: Hi all, could anybody tell me what happened to my logistic regression in R?

mylog = glm(mytraindata$V1 ~ ., data = mytraindata, family = binomial(logit))

It generated the following error message:

Error in model.frame.default(Terms, newdata, na.action = na.action, xlev = object$xlevels) : factor 'state1' has new level(s) AP

Hi Michael, I am 99.9% certain that what you claim above is completely false. That error looks to arise from a call in predict.lm, which is called when you use predict.glm. The big giveaway for me was 'newdata' in the call that produced the error. That suggested to me you were using a predict method. I then tracked down a potential source of the error by looking at what happens when you call predict on a glm object. Members of the list shouldn't have to do this to help answer posts to the list. The posting guide asks you to provide minimal, self-contained, reproducible examples; failing that, the R code you actually used.

After the chastisement ;-) some help:

1) I'm guessing, but in the dataset that you predict for, does the state1 variable have more/different levels than the training data? Try levels(mytraindata$state1) and levels(mytestdata$state1), where mytestdata is the data you supplied as 'newdata' in your call to predict.

2) Your glm call contains some redundancy. It can be simplified to:

mylog <- glm(V1 ~ ., data = mytraindata, family = binomial)

realising that the logit link is the default in the binomial family, and that if you specify data you don't need to refer to the data object in the formula.

HTH G

--
Dr. Gavin Simpson, ECRC, UCL Geography, Pearson Building, Gower Street, London, UK. WC1E 6BT. [t] +44 (0)20 7679 0522 [f] +44 (0)20 7679 0565 [e] gavin.simpsonATNOSPAMucl.ac.uk [w] http://www.ucl.ac.uk/~ucfagls/
[w] http://www.freshwaters.org.uk
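Point 1 above can be reproduced in a few lines; a sketch with invented data showing that predict() on a glm object fails in exactly this way when the new data contains a factor level absent from the training data:

```r
set.seed(1)
# Invented training data: binary response plus a two-level factor
train <- data.frame(V1 = rbinom(50, 1, 0.5),
                    state1 = factor(sample(c("NY", "CA"), 50, replace = TRUE)))
fit <- glm(V1 ~ ., data = train, family = binomial)

# "AP" was never seen during fitting, so predict() must refuse it
test <- data.frame(state1 = factor(c("NY", "AP")))
res <- try(predict(fit, newdata = test, type = "response"), silent = TRUE)
inherits(res, "try-error")  # TRUE: factor state1 has new level(s) AP
```

The usual fix is to refit with the union of levels, or to drop/recode the unseen levels in the prediction data before calling predict().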
[R] The greatest common divisor between more than two integers
Hi, Does anybody know if there is a function in R which computes the greatest common divisor of several (more than two) integers? Best, Atte

Atte Tenkanen, University of Turku, Finland, Department of Musicology, +35823335278, http://users.utu.fi/attenka/
Re: [R] The greatest common divisor between more than two integers
On Wed, Jul 15, 2009 at 8:55 AM, Atte Tenkanen atte...@utu.fi wrote: Do somebody know if there is a function in R which computes the greatest common divisor between several (more than two) integers?

Is there a function for computing the greatest common divisor of *two* numbers? I can't find one, but assume that there is such a function, and call it gcd. Then you could define a recursive function to do the job. Something like

new_gcd <- function(v) {
  if (length(v) == 2) return(gcd(v[1], v[2]))
  return(gcd(v[1], new_gcd(v[2:length(v)])))
}

where v is a vector containing the numbers you want to calculate the greatest common divisor of.

-- Michael Knudsen micknud...@gmail.com http://lifeofknudsen.blogspot.com/
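Base R indeed has no gcd(); a sketch of the missing two-argument helper (Euclidean algorithm; the names gcd2 and gcd_all are invented here), plus Reduce(), which folds it over a whole vector without explicit recursion, since gcd is associative:

```r
# Euclidean algorithm for two integers (hypothetical helper; not in base R)
gcd2 <- function(a, b) {
  while (b != 0) {
    tmp <- b
    b <- a %% b
    a <- tmp
  }
  abs(a)
}

# Fold the pairwise gcd over a whole vector of integers
gcd_all <- function(v) Reduce(gcd2, v)

gcd_all(c(12, 18, 30))  # 6
```

Reduce(gcd2, v) computes gcd2(gcd2(v[1], v[2]), v[3]) and so on, which is the same as the recursive formulation in the post.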
Re: [R] How to deploy statistical models built in R in real-time?
I am framing this as a question since I would like to know how folks are currently deploying the models they build in R. Say you want to use the results of your model inside another application in real time or on demand - how do you do it? How do you use the decisions you get back from your models?

Late answer, sorry. I love PMML (and have been advocating it since at least version 2.0) but I rarely see it deployed in commercial companies. What I see, in decreasing order of importance:

1. Pre-scoring. That is, pre-calculate the scores of your model for each customer and stuff them into a database that your operational system can access. Example: customer churn in mobile telco.

2. Convert the model to SQL. This is obviously easier for some model types (trees, k-nearest neighbour, ...) than others. This is surprisingly common. Example: a Big Famous Data Insights Company created a global customer segmentation model (really: 'cause all markets and cultures are the same) for a multi-national company and distributed it as a Word document with pseudo-SQL fragments for each country to implement. Gets over the problem of different technologies in different countries.

3. Pre-scoring for multiple likely events. Example: for cross- and up-sell in a call centre (which is phenomenally effective) you really want to include the outcome of the original call as an input to the propensity model. A badly handled complaint call does not offer the same opportunities for flogging more products as a successful upgrade to a higher price plan (but might be an opportunity to extend an (expensive) retention offer). The Right Way to do this is to run the model in real time, which would usually mean PMML if you have created the model in R. At least one vendor recommended "just" pre-scoring the model for each possible (relevant) call outcome and storing that in the operational database. That vendor also sold databases :-)

4. Use PL/R to embed R within your (PostgreSQL) RDBMS. (Rare.)

5. Embed R within your operational application and run the model that way (I have done this exactly once).

Somewhere between 1 and 2 is an approach that doesn't really fit with the way you framed the question (and is probably OT for this list). It is simply this: if you want to deploy models for real-time or fast on-demand usage, usually you don't implement them in R (or SAS or SPSS). In Marketing, which is my main area, there are dedicated tools for real-time decisioning and marketing like Oracle RTD [1], Unica inbound marketing [2], Chordiant Recommendation Advisor, and others [3], though only the first of these can realistically be described as "modelling". Happy to discuss this more offline if you want. And I really like your approach - hope to actually use it some day. Allan.

More at http://www.pcagroup.co.uk/ and http://www.cybaea.net/Blogs/Data/

[1] http://www.oracle.com/appserver/business-intelligence/real-time-decisions.html
[2] http://www.unica.com/products/Inbound_Marketing.htm [web site down at time of writing]
[3] E.piphany and SPSS Interaction Builder appear to be nearly dead in the market.

On 08/07/09 23:38, Guazzelli, Alex wrote: [...] As you may know, a PMML package is available for R that allows many mining models to be exported into the Predictive Model Markup Language. PMML is the standard way to represent models and can be exported from most statistical packages (including SPSS, SAS, KNIME, ...). Once your model is represented as a PMML file, it can easily be moved around. PMML allows for true interoperability. We have recently published an article about PMML in The R Journal. It basically describes the PMML language and the package itself.
If you are interested in finding out more about PMML and how to benefit from this standard, please check the link below. http://journal.r-project.org/2009-1/RJournal_2009-1_Guazzelli+et+al.pdf

We have also written a paper about open standards and cloud computing for the SIGKDD Explorations newsletter. In this paper, we describe the ADAPA Scoring Engine, which executes PMML models and is available as a service on the Amazon cloud. ADAPA can be used to deploy R models in real time from anywhere in the world. I believe it represents a revolution in data mining since it allows anyone that uses R to make effective use of predictive models at a cost of less than $1/hour. http://www.zementis.com/docs/SIGKDD_ADAPA.pdf

Thanks! Alex
[R] Starting RCmdr from the Commandline (and a shortcut) in Windows
I thought I'd share the experience I had in getting Rcmdr to start automatically when R starts up. I'd read of solutions that would start Rcmdr every time R started up, but I didn't want it to start every time, only when I wanted it to.

First, I created a shortcut on the desktop to R and changed its name to Rcmdr. Next, I changed the target of the shortcut to (edit the version number if needed):

"C:\Program Files\R\R-2.9.1\bin\Rgui.exe" --sdi RCMDR=TRUE

I added the option --sdi (I like this interface the best) and created the RCMDR=TRUE environment variable. Next, I edited the Rprofile.site file in the etc directory of the R installation and added the following lines to the bottom:

defpack = getOption("defaultPackages")
mylibs = c("lattice", "MASS", "Matrix")
if (Sys.getenv("RCMDR") == "TRUE") mylibs = c(mylibs, "Rcmdr")
options(defaultPackages = c(defpack, mylibs))

Note that mylibs is the list of libraries that I wanted autoloaded. Although Matrix will autoload lattice, I didn't want it to announce it every time R started, so I loaded it first.

Now clicking on Rcmdr on the desktop starts up Rcmdr (in addition to R), and starting up R regularly does not start Rcmdr, which is what I wanted! I hope someone else may find this useful! -Scott

* Scott K. Hyde, Assistant Professor of Statistics and Mathematics, College of Math and Sciences, Brigham Young University -- Hawaii, Laie, HI 96762
Re: [R] How do I know where is R? - VB Programming
This works on Linux (at least under the bash shell) and also on Windows Vista, and presumably earlier versions of Windows as well:

R --vanilla --slave -e "R.home()"

but the R part at the beginning does have the problem that it only works if R is on your path. If, on Windows, you normally double-click the R icon on your desktop or use the R utilities in the batchfiles distribution, there would be no need to have R on your path. The batchfiles utilities use the registry, and failing that heuristics, to locate R.

On Wed, Jul 15, 2009 at 2:13 AM, Peter Dalgaard p.dalga...@biostat.ku.dk wrote: [...]
[R] loading multiple .Rdata and preserving variable names
Dear R-users, I need to load outputs from multiple previous calculations into one R session for comparison and (cross-)analysis. The previous calculations are stored in different directories in .Rdata files (I don't know if this is the best storage for later usage, but the previous project could potentially be recalculated and saved into a different format if needed.) I can load multiple .Rdata files consecutively (in which the variable names are identical), but I don't know how to preserve the variables from these files or make them unique. I would imagine pointing to them as in a list:

# loading 2, max. 4 outputs of previous calculations
load(DataSet1) # VariableA is present
load(DataSet2) # VariableA is present, too
# both VariableA listed and present
DataSet1$VariableA$parameters
DataSet2$VariableA$parameters

But what is the way to feed all variables into a list? Or more generally, what is an efficient way to work with multiple separate outputs which one would like to compare? (I have read something about environments, but I understood they are only for functions. I could create environments, but was not successful in using them at all (location change or separate sandbox). The functions saveCache, loadCache, saveObject, loadObject are not really what I have in mind either -- saveObject(list=ls(), NewObjectFile) is not a solution...) Thanks for any hint in advance. Cheers, Zroutik
Re: [R] Averaging dataframes that are stored in a list
Hi, you could try the reshape package:

l = list(d1 = data.frame(ID = letters[1:4], value = 1:4),
         d2 = data.frame(ID = letters[1:4], value = -c(1:4)))
library(reshape)
m = melt(l)
cast(m, . ~ ID, fun.aggregate = mean)

HTH, baptiste

2009/7/15 Mark Na mtb...@gmail.com: Dear R-helpers, I have a list containing 5000 elements; each element is a dataframe containing one ID column (identical over the 5000 dataframes) and 9 numeric variables, e.g. ID VAR1 VAR2 VAR3 ... VAR9. I would like to create a new dataframe containing the ID column and the mean values of the 9 numeric variables. So the structure of this new dataframe would be identical to the structure of the dataframes stored in my list (and the ID column would also be identical), but the values would be mean values. I've been attempting to do this with rowMeans and subscripting the list using double square brackets, but I can't get it to work. I'd appreciate any pointers to get me going, thanks! Mark Na

--
Baptiste Auguié, School of Physics, University of Exeter, Stocker Road, Exeter, Devon, EX4 4QL, UK. Phone: +44 1392 264187. http://newton.ex.ac.uk/research/emag
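A base-R alternative sketch, assuming (as in the original question) that the ID column is identical across all data frames: sum the numeric columns with Reduce() and divide by the number of frames:

```r
# Two toy data frames standing in for the 5000-element list
l <- list(d1 = data.frame(ID = letters[1:4], value = 1:4),
          d2 = data.frame(ID = letters[1:4], value = -(1:4)))

num <- lapply(l, function(d) d[, -1, drop = FALSE])  # drop the ID column
avg <- Reduce(`+`, num) / length(num)                # elementwise mean
result <- data.frame(ID = l[[1]]$ID, avg)
result
```

With 9 numeric columns instead of 1, the same two lines work unchanged, because `+` on data frames is elementwise across all columns.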
Re: [R] loading multiple .Rdata and preserving variable names
On 15/07/2009 5:00 AM, Žroutík wrote: [...]

Environments are basic in R. If they seem to be only for functions, it's because everything in R is done by functions. Given two .RData files, you can load them into separate environments. Dealing with conflicting variable names is another issue, of course. For example:

DataSet1 <- new.env()
load("somefile", envir = DataSet1)

Now use ls(DataSet1) to see what you've got; repeat for DataSet2, etc. You can get a variable as DataSet1$VariableA. Duncan Murdoch
Re: [R] loading multiple .Rdata and preserving variable names
On Wed, Jul 15, 2009 at 10:41 AM, Duncan Murdoch murd...@stats.uwo.ca wrote: Given two .RData files, you can load them into separate environments. [...]

As an alternative to loading, you can attach() the .RData files and fetch variables using get(). Create a couple of .RData files with an 'x' in them:

x <- 1
save(x, file = "x1.RData")
x <- 2
save(x, file = "x2.RData")
q()

Now in a new R session:

attach("x1.RData", warn = FALSE)
attach("x2.RData", warn = FALSE)
get("x", pos = "file:x1.RData")
[1] 1
get("x", pos = "file:x2.RData")
[1] 2

I've used warn = FALSE to suppress the warnings that attach() gives when attached objects mask other objects of the same name. You can also detach() these things once you're done with them. Barry
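Pulling the two answers together, one sketch is to load each file into its own environment and convert that environment to a named list, so the objects can be addressed as DataSets$DataSet1$x and nothing ever collides in the global workspace (the file names below are created on the fly for illustration):

```r
# Create two example .RData files in a temp dir, each containing an 'x'
tmp <- tempdir()
x <- 1; save(x, file = file.path(tmp, "x1.RData"))
x <- 2; save(x, file = file.path(tmp, "x2.RData"))

files <- file.path(tmp, c("x1.RData", "x2.RData"))
DataSets <- lapply(files, function(f) {
  e <- new.env()
  load(f, envir = e)
  as.list(e)          # turn the environment into an ordinary list
})
names(DataSets) <- c("DataSet1", "DataSet2")

DataSets$DataSet1$x   # 1
DataSets$DataSet2$x   # 2
```

This scales to any number of result directories: build `files` with list.files(..., pattern = "\\.RData$") and name the list after the directories.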
[R] (newbie) sum for certain number of rows
I have the following data in a data.csv file, separated by spaces:

0 0 1 0 0 1 0 1
0 0 0 0 0 0 0 0
1 0 0 1 1 0 1 0
0 0 1 1 0 0 0 0
1 1 0 0 0 0 1 1
0 0 0 0 0 0 0 0
0 1 0 1 1 0 1 0
1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1
etc...

I wish to calculate the sum of each column for a certain number of rows. For example, if I want sums of the data after each 3 rows, it should display

1 0 1 1 1 1 1 1
1 1 1 1 0 0 1 1
2 3 2 3 3 2 3 2

So far, this is what I have done:

xx <- read.table("data.csv", header = FALSE)
ss <- t(apply(xx, 2, sum)) # which displayed the sum of all rows

I tried my best to look for a solution on the Internet but so far haven't managed to find it. I would be extremely grateful if someone could point me to how to go about it. Thanks. Kelvin
[R] problems in resampling time series, block length, trend.test
Hi, I have a time series (say x) of 13 years showing an evident increase. I want to exclude two observations (the fourth and 10th), so I do trend.test(x[-c(4,10)]) where:

x[-c(4,10)]
[1] 7 37 79 72 197 385 636 705 700 1500 1900

and I get:

Spearman's rank correlation rho
data: x[-c(4, 10)] and time(x[-c(4, 10)])
S = 4, p-value < 2.2e-16
alternative hypothesis: true rho is not equal to 0
sample estimates:
rho
0.9818182

A very strong positive correlation: the x is increasing. Now, I would like to resample this time series, because I have other time series where the trend is more uncertain and the sample size is still small. I read "Resampling Methods in R: The boot Package" (ISSN 1609-3631) and I believe that a way of doing it is by a block bootstrap, which allows one to deal with the autocorrelation (I know that some of my time series show autocorrelation, lag=1). I used the trend.test function (library pastecs) and I did:

trend.x = trend.test(x[-c(4,10)], R=999)
trend.x
trend.x$p.value
plot(trend.x)

And I get:

trend.x = trend.test(x[-c(4,10)], R=999)
trend.x
BLOCK BOOTSTRAP FOR TIME SERIES
Fixed Block Length of 1
Call: tsboot(tseries = x, statistic = test.trend, R = R, l = 1, sim = "fixed")
Bootstrap Statistics :
     original       bias    std. error
t1* 0.9818182  -0.9844001   0.3272114
trend.x$p.value
[1] 0

I suppose that the problem arises from the length of the block (1), and in this way I get rho=0 (nevertheless I don't understand how it is significant). I would like to change the block length but I am not able to (I tried several different ways, unsuccessfully). So, two questions: 1) How can I change the block length? 2) In terms of block length and type of simulation (sim="fixed" or "geom"), what is the best way of doing it?
Thanks in advance for any suggestion, Best wishes Simone
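On question 1: the block length is the l argument of tsboot() in the boot package, which trend.test() wraps, so a longer block can be requested by calling tsboot() directly. A sketch on the series from the post, with a hand-written Spearman-rho statistic standing in for the one pastecs uses internally:

```r
library(boot)

# The series from the post (after dropping the 4th and 10th observations)
x <- c(7, 37, 79, 72, 197, 385, 636, 705, 700, 1500, 1900)

# Spearman correlation of the series with time, as the bootstrap statistic
rho_stat <- function(ts) cor(ts, seq_along(ts), method = "spearman")

set.seed(123)
b <- tsboot(x, statistic = rho_stat, R = 999,
            l = 3, sim = "fixed")   # fixed blocks of length 3
b$t0                                # observed rho on the original series
```

sim = "geom" instead draws blocks of geometrically distributed length with mean l (the stationary bootstrap), which avoids the artefacts of a fixed block boundary; either way l should reflect the autocorrelation range, so l = 1 (which resamples single points and destroys the trend, hence the rho near 0 in the output above) is rarely what you want.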
Re: [R] (newbie) sum for certain number of rows
One way is the following:

dat <- read.table(textConnection(
"0 0 1 0 0 1 0 1
0 0 0 0 0 0 0 0
1 0 0 1 1 0 1 0
0 0 1 1 0 0 0 0
1 1 0 0 0 0 1 1
0 0 0 0 0 0 0 0
0 1 0 1 1 0 1 0
1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1"))
closeAllConnections()

k <- 3
ind <- rep(seq(1, nrow(dat) / k), each = k)
rowsum(dat, ind)

I hope it helps. Best, Dimitris

kelvin lau wrote: [...]

--
Dimitris Rizopoulos, Assistant Professor, Department of Biostatistics, Erasmus University Medical Center. Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands. Tel: +31/(0)10/7043478, Fax: +31/(0)10/7043014
[R] interacting variables in a formulae in R
Dear R mailing list subscribers, I have a question about how R handles interacting variables in model creation, using glm for instance. If I write:

model = glm(solution ~ descriptor1 + descriptor2 + descriptor3, family = binomial(link = "logit"))

I end up with coefficients for a logistic model that I can easily plug into the mathematical form of such a model (a weighted sum of these descriptors). But how do I interpret what R does with interacting terms, like this?

model = glm(solution ~ descriptor1:descriptor2 + descriptor3, family = binomial(link = "logit"))

What does R do with the numerical values of descriptor1 and descriptor2 in order to fit the data? How can I rewrite the model in mathematical form? Thanks in advance. Best regards.

-- Peter Schmidtke, PhD Student at the Molecular Modeling and Bioinformatics Group, Dep. Physical Chemistry, Faculty of Pharmacy, University of Barcelona
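For archive readers, the answer to the last question: for numeric variables, descriptor1:descriptor2 enters the linear predictor as their elementwise product, so the fitted model is logit(p) = b0 + b1*(descriptor1*descriptor2) + b2*descriptor3. A sketch with simulated data (all names invented) confirming that the interaction column of the design matrix is just that product:

```r
set.seed(1)
# Invented numeric descriptors and a binary response
d1 <- rnorm(20); d2 <- rnorm(20); d3 <- rnorm(20)
y  <- rbinom(20, 1, 0.5)

fit <- glm(y ~ d1:d2 + d3, family = binomial(link = "logit"))
X   <- model.matrix(fit)
colnames(X)   # main effects come before the interaction: "(Intercept)", "d3", "d1:d2"

# The interaction column is exactly the elementwise product d1 * d2
all.equal(unname(X[, "d1:d2"]), d1 * d2)
```

For factors, a:b instead expands to products of the dummy columns of a and b, one coefficient per level combination, so the mathematical form depends on the contrasts in use.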
Re: [R] plotting confidence intervals
Erin Hodgess wrote: Hi R people: If I have fitted values from a model, how do I plot the (1-alpha)100% confidence intervals along with the fitted values, please? Also, if the intervals could be shaded gray, that would be nice too, please? I checked confint, but that doesn't seem to do what I want.

Hi Erin, probably the simplest way is to plot points with error bars, as in the following example:

library(plotrix)
coefficients <- rnorm(5) + 4
plot(coefficients, main = "Coefficient plot", xaxt = "n",
     ylim = c(0, 6), xlab = "Coefficients")
axis(1, at = 1:5, labels = paste("Coef", 1:5, sep = ""))
dispersion(1:5, coefficients, abs(rnorm(5)), col = "lightgray")

The dispersion function in the plotrix package is just one way to illustrate confidence intervals. Jim
Re: [R] ggplot2: geom_errorbarh()
Hi, I expect 'width' to set the width of the horizontal line of the whisker, as it does with errorbar(). Regards/Cordialement, Benoit Boulinguiez

-----Original Message-----
From: hadley wickham [mailto:h.wick...@gmail.com]
Sent: Sunday, 12 July 2009 11:04
To: Benoit Boulinguiez
Cc: r-help@r-project.org
Subject: Re: [R] ggplot2: geom_errorbarh()

Hi Benoit, what do you expect width to do? You are already setting the left and right extents with xmin and xmax. Hadley

On Thu, Jul 9, 2009 at 10:37 AM, Benoit Boulinguiez benoit.boulingu...@ensc-rennes.fr wrote: Hi all, quick question: is the optional command width effective in the geom_errorbarh() layer of ggplot? Because I can't get it to work on this graph http://www.4shared.com/file/116919103/93488d88/iso_2PrsH.html

pdf(file = "iso_2PrsH.pdf", width = 7, height = 7)
NC60.iso.graph <- ggplot(NC60.DATA, aes(Ce, Qe)) +
  geom_point(col = MaCouleur1, size = 4) +
  geom_errorbar(aes(ymax = NC60.DATA$Qe + NC60.DATA$sdQe,
                    ymin = NC60.DATA$Qe - NC60.DATA$sdQe),
                colour = alpha("black", 0.4), width = 1) +
  geom_errorbarh(aes(xmax = NC60.DATA$Ce + NC60.DATA$sdCe,
                     xmin = NC60.DATA$Ce - NC60.DATA$sdCe),
                 colour = alpha("black", 0.4), width = 1) +
  geom_line(data = NC60.Res4.curve, aes(x, y),
            size = 1, colour = alpha("black", 0.5)) +
  xlab(C[e]~(mmol/m^3)) +
  ylab(q[e]~(mmol/m^3))
print(NC60.iso.graph)
dev.off()

Regards/Cordialement

-----
Benoit Boulinguiez, Ph.D student, Ecole de Chimie de Rennes (ENSCR), Bureau 1.20, Equipe CIP, UMR CNRS 6226 Sciences Chimiques de Rennes, Avenue du Général Leclerc, CS 50837, 35708 Rennes CEDEX 7. Tel +33 (0)2 23 23 80 83, Fax +33 (0)2 23 23 81 20, http://www.ensc-rennes.fr/
-- http://had.co.nz/
[R] end of daylight saving time
Hi, I read from the help on DateTimeClasses and various posts on this list that, quite logically, one needs to specify whether DST is in effect for the first hour after the change from DST to ST in autumn. Hence, in my time zone and on Mac OS X, I can do this:

a <- as.POSIXct("2008-11-02 01:30:00", tz="EST5EDT") # to get automatic use of DST
b <- as.POSIXct("2008-11-02 01:30:00", tz="EST") # to tell R this is the second occurrence of 1:30 that day, in ST
difftime(b, a)

But why can't I do this, to handle several date-times at once?

c <- rep("2008-11-02 01:30:00", 2)
tzone <- c("EST5EDT", "EST")
as.POSIXct(c, tz=tzone)
Erreur dans strptime(xx, f <- "%Y-%m-%d %H:%M:%OS", tz = tz) : valeur 'tz' incorrecte
(Error in strptime(...): invalid 'tz' value)

??? Thanks, Denis Chabot

sessionInfo() R version 2.9.1 Patched (2009-07-09 r48929) x86_64-apple-darwin9.7.0 locale: fr_CA.UTF-8/fr_CA.UTF-8/C/C/fr_CA.UTF-8/fr_CA.UTF-8 attached base packages: [1] stats graphics grDevices utils datasets methods base loaded via a namespace (and not attached): [1] tools_2.9.1
Re: [R] (newbie) sum for certain number of rows
In the zoo package, rollapply can do that.

library(zoo)
out <- rollapply(ts(dat), width = 3, by = 3, sum)
out
Time Series: Start = 3 End = 9 Frequency = 0.333
  V1 V2 V3 V4 V5 V6 V7 V8
3  1  0  1  1  1  1  1  1
6  1  1  1  1  0  0  1  1
9  2  3  2  3  3  2  3  2

This gives a ts class time series result, so if you need a data frame result just convert it back:

out <- as.data.frame(out)

See ?rollapply for more.

On Wed, Jul 15, 2009 at 6:03 AM, kelvin lau kelvin...@yahoo.com wrote: I have the following data in a data.csv file, separated by spaces:

0 0 1 0 0 1 0 1
0 0 0 0 0 0 0 0
1 0 0 1 1 0 1 0
0 0 1 1 0 0 0 0
1 1 0 0 0 0 1 1
0 0 0 0 0 0 0 0
0 1 0 1 1 0 1 0
1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1
etc...

I wish to calculate the sum of each column over a certain number of rows. For example, if I want the sum of the data after each 3 rows, it should display

1 0 1 1 1 1 1 1
1 1 1 1 0 0 1 1
2 3 2 3 3 2 3 2

So far, this is what I have done:

xx <- read.table("data.csv", header=FALSE)
ss <- t(apply(xx, 2, sum)) # which displayed the sum of all rows

I tried my best to look for a solution on the Internet but so far haven't managed to find it. I would be extremely grateful if someone could point me in the right direction. Thanks. Kelvin
[R] Quotes - useFancyQuotes
I need to get the exact sentence, with the quotes: OPTIONS COORDSYS("Surface 1" AS COMPONENT); but this:

options(useFancyQuotes = FALSE)
paste("OPTIONS COORDSYS(", dQuote("Surface 1"), " AS COMPONENT);")

gives me this:

[1] "OPTIONS COORDSYS( \"Surface 1\" AS COMPONENT); "

and it's not readable as SpatialSQL code by the GIS. How do I deal with this problem? Paulo E. Cardoso
Re: [R] Error when sampling from SpatialLines
Which version of the sp package are you using? In a version that has been submitted to CRAN, and is available for CVS checkout from the R-spatial sourceforge repository, a weakness was addressed when a Line object received no sample points, either because of its shortness, or because n is small for the total length of points. I suggest that you try the latest version of sp from the CVS repository. Roger Bivand

PS. The traffic that you mention about sampling on lines was on the R-sig-geo list, not this list, so a post there might have been a better idea. PPS. Always state package version numbers when reporting problems, for example by giving sessionInfo() output. For errors, always also give the output of traceback(), or at least its first lines, which may show where the problem arises.

Elke Moons wrote: Dear forum, I am working in R 2.9.1 and I am trying to sample locations from a network file. Neither reading in nor plotting is a problem; however, when I try to sample from the file I get the following message:

nwlim <- readShapeLines("C:/Limburg_nwshape", proj4string=CRS("+init=epsg:31300"))
plot(nwlim)
randacc <- spsample(nwlim, n=1000, "random")
Error in function (classes, fdef, mtable) : unable to find an inherited method for function "coordinates", for signature "numeric"

I found a similar error message in the archives, where prof. dr. Bivand indicated to try readOGR instead of readShapeLines; however, I get the same error message.

nwlim2 <- readOGR("C:/Limburg_nwshape.shp", "Limburg_nwshape")
OGR data source with driver: ESRI Shapefile Source: "C:/Limburg_nwshape.shp", layer: "Limburg_nwshape" with 60745 rows and 24 columns Feature type: wkbLineString with 2 dimensions
randacc <- spsample(nwlim2, n=1000, "random")
Error in function (classes, fdef, mtable) : unable to find an inherited method for function "coordinates", for signature "numeric"

Can anyone help me with this issue?
The shape file in itself is not so large (only a province was taken instead of the entire file of Belgium because of memory issues), only 8.11 Mb. Thanks in advance! Elke

__ Elke Moons, PhD Instituut voor Mobiliteit/Transportation Research Institute Universiteit Hasselt Wetenschapspark 5/Lokaal 1.10 3590 Diepenbeek BELGIUM Tel. +32-11-26.91.26 Fax +32-11-26.91.99 E-mail: elke.mo...@uhasselt.be
Re: [R] Matrix multiplication precision
On Wed, Jul 15, 2009 at 3:42 AM, Nair, Murlidharan T mn...@iusb.edu wrote: Hi!! I am trying to multiply 5 matrices and then use the inverse of that matrix for further computation. I read about precision problems from the archives and the suggestion was to use as.numeric while computing the products.

Hmm. I'm not sure what the origins of that advice might be but I don't think it would apply in general.

I am still having problems with the results. Here is how I am using it:

# Mn.mat <- (T.mat %*% Rz.mat %*% Q.mat %*% Rz.mat %*% T.mat)
# I was doing this in one step, which I have broken down into multiple steps as below.
Mn1.mat <- matrix(as.numeric(T.mat %*% Rz.mat), nrow=4)
Mn2.mat <- matrix(as.numeric(Mn1.mat %*% Q.mat), nrow=4)
Mn3.mat <- matrix(as.numeric(Mn2.mat %*% Rz.mat), nrow=4)
Mn.mat <- matrix(as.numeric(Mn3.mat %*% T.mat), nrow=4)

I doubt very much that the as.numeric would have any effect on precision in these cases. You are simply taking a numeric result in its matrix form, converting it to a vector, then converting the vector back to a matrix:

mm <- matrix(round(rnorm(8), 3), nrow = 4)
mm
       [,1]  [,2]
[1,] -0.110 2.195
[2,]  0.616 0.353
[3,]  0.589 0.970
[4,]  1.028 0.857
as.numeric(mm)
[1] -0.110 0.616 0.589 1.028 2.195 0.353 0.970 0.857

The key in situations like this is to analyze the structure of the computation and decide whether you can group the intermediate results. You have written your product as T %*% Rz %*% Q %*% Rz %*% T. What are the characteristics of T, Rz, and Q? Are they square and symmetric, square and triangular, diagonal? Is Q orthogonal (the factors in an orthogonal-triangular decomposition are often written as Q and R)? Did you happen to skip a few transposes in that formula - often formulas like that generate symmetric matrices.
And the big lesson, of course, is the first rule of numerical linear algebra: just because a formula is written in terms of the inverse of a matrix doesn't mean that is a good way to calculate the result; in fact, it is almost inevitably the worst way of calculating the result. You only calculate the inverse of a matrix after you have investigated all other possible avenues for calculating the result and found them to be fruitless. You haven't told us what you plan to do with the inverse, and that is an important consideration. If, for example, it represents a variance-covariance matrix, and you want to express the result as standard deviations and correlations, you would not compute the variance-covariance matrix but stop instead at the Cholesky factor of the inverse. You may want to check the facilities of the Matrix package for expressing your calculation. It is a recommended package that is included with binary versions of R from 2.9.0 and has many ways of expressing and exploiting structure in matrices.

For getting the inverse I am doing the following:

Mn.mat.inv <- qr.solve(Mn.mat)

Will I run into precision problems when I do the above? Thanks ../Murli
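As a small illustration of the point above about avoiding explicit inverses (a sketch, not from the original thread; the matrix here is made up): when the inverse is only needed to multiply a vector, solve(A, b) factors A once and solves the system directly, which is both cheaper and numerically more stable than forming solve(A) first.

```r
set.seed(42)
A <- crossprod(matrix(rnorm(16), 4, 4)) + diag(4)  # a made-up, well-conditioned 4x4 matrix
b <- rnorm(4)

x.inverse <- solve(A) %*% b  # via the explicit inverse: avoid this
x.direct  <- solve(A, b)     # solve the linear system directly: preferred

all.equal(as.vector(x.inverse), x.direct)
```

On a well-conditioned matrix the two agree to rounding error; on an ill-conditioned one the direct solve is the safer route.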
Re: [R] Read PNG file and display with more than 256 colors RGDAL
If you read the help page for the image method you are actually using: ?image.SpatialGridDataFrame you will see that it supports red=, green=, and blue= arguments. This suggests that you could try:

myTile <- readGDAL("basemap.png", silent=TRUE)
class(myTile)
summary(myTile)
# assuming the band positions are 1, 2, 3
image(myTile, red=1, green=2, blue=3)

On the other hand, the colours in these images are arbitrary manipulations of the sensors, and have already been warped and resampled, so decimating the colour representation isn't inappropriate, and 256 is not a small number. Roger Bivand

PS. Detailed rgdal questions like this may get a faster response on R-sig-geo.

Paul.Rustomji wrote: Hello list, I am trying to use a Googlemaps tile (a 640 x 640 px PNG file) as a background to a plot and have been using the rgdal library to read in the PNG file (modified from code in the RGoogleMaps package). This works OK. My problem is that the SGDF2PCT function in rgdal seems to be limited to 256 colors and this is introducing (I think) some degradation to the image, which appears on the output graph as speckling. Is there a way around this? I have been reading about the EBImage package but it (a) seemed to have too many dependencies on other software for my purposes and (b) wouldn't load properly in any case on my machine (couldn't find DLLs that it was looking for). Ideally I would like to read in the png image and display it in its original form as a plot background, hopefully with all its colors represented rather than being converted to a 256 color scale.
A code example is below:

png(file="file.png", 4000, 4000)
par(mar=c(0,0,0,0))
myTile <- readGDAL("basemap.png", silent=TRUE) # basemap.png is a PNG file returned from googlemaps
myTile@data <- myTile@data[, 1:3]
col <- SGDF2PCT(myTile, ncolors=256) # myTile is a SpatialGridDataFrame with 3 bands
myTile$ind <- col$idx # add the colour index to the data frame
myTile <- as.image.SpatialGridDataFrame(myTile["ind"], 1, 2)$z
attr(myTile, "COL") <- col$ct
attr(myTile, "type") <- "rgb"
myTile <- list(lat.center, lon.center, NA, myTile)
image(z=myTile[[4]], col = attr(myTile[[4]], "COL"))
dev.off()

sessionInfo() R version 2.9.1 (2009-06-26) i386-pc-mingw32 locale: LC_COLLATE=English_Australia.1252;LC_CTYPE=English_Australia.1252;LC_MONETARY=English_Australia.1252;LC_NUMERIC=C;LC_TIME=English_Australia.1252 attached base packages: [1] grid stats graphics grDevices utils datasets methods base other attached packages: [1] abind_1.1-0 rgdal_0.6-9 sp_0.9-37 gridBase_0.4-3 loaded via a namespace (and not attached): [1] lattice_0.17-25 tools_2.9.1 Paul Rustomji
Re: [R] Quotes - useFancyQuotes
On 15/07/2009 7:28 AM, Paulo E. Cardoso wrote: I need to get the exact sentence, with the quotes: OPTIONS COORDSYS("Surface 1" AS COMPONENT); but this:

options(useFancyQuotes = FALSE)
paste("OPTIONS COORDSYS(", dQuote("Surface 1"), " AS COMPONENT);")

gives me this:

[1] "OPTIONS COORDSYS( \"Surface 1\" AS COMPONENT); "

and it's not readable as SpatialSQL code by the GIS. How do I deal with this problem?

Your string is fine; it is only being displayed with escapes. If you want to see it as it is, use cat(), not the default print(). For example:

cat("OPTIONS COORDSYS( \"Surface 1\" AS COMPONENT);\n")
OPTIONS COORDSYS( "Surface 1" AS COMPONENT);

Duncan Murdoch
[R] configure encoding by default
I want to use Russian letters for my diagrams. I do it in this manner:

m <- "заголовок"
Encoding(m) <- "UTF-8"
plot(1, 1, main=m)

But it is not convenient. How do I configure R to use UTF-8 for all strings, so that it works without the Encoding function, as in plot(1, 1, main="заголовок")?
[R] problem with merging matrices
Dear all, I'm a relatively new user of R and I have a problem with merging a collection of matrices. All matrices in this collection have the same dimension (532 rows and 532 columns), but can differ in the row and column names. I'd like to merge these matrices in such a way that the resulting matrix only contains unique row and column names and that, in case of a match between the row or column names, the smallest value is retained. As an example says more:

A1 <- matrix(1:9, ncol=3, byrow=TRUE)
rownames(A1) <- colnames(A1) <- c("a","b","c")
A1
  a b c
a 1 2 3
b 4 5 6
c 7 8 9

A2 <- matrix(c(7,1,3,10,2,7,3,1,8), ncol=3, byrow=TRUE)
rownames(A2) <- colnames(A2) <- c("a","y","z")
A2
   a y z
a  7 1 3
y 10 2 7
z  3 1 8

I want something like this to be returned:

   a  b  c  y  z
a  1  2  3  1  3
b  4  5  6 NA NA
c  7  8  9 NA NA
y 10 NA NA  2  7
z  3 NA NA  1  8

Many thanks, Jurgen. I'm using: R-2.9.1 (2009-06-26) on a Windows XP platform.
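One possible way to do this (a sketch, not part of the original message; the helper name merge_min is made up): expand each matrix to the union of the row and column names, then take the elementwise minimum, ignoring NA.

```r
merge_min <- function(A, B) {
  rn <- union(rownames(A), rownames(B))
  cn <- union(colnames(A), colnames(B))
  # place each matrix into an NA-filled frame covering all names
  expand <- function(M) {
    out <- matrix(NA, length(rn), length(cn), dimnames = list(rn, cn))
    out[rownames(M), colnames(M)] <- M
    out
  }
  pmin(expand(A), expand(B), na.rm = TRUE)  # smallest value wins where both are present
}

merge_min(A1, A2)  # A1, A2 as defined in the message above
```

For the whole collection, Reduce(merge_min, list.of.matrices) would fold the same function over a list of matrices.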
Re: [R] (simple) xml into data.frame and reverse
Thanks Dieter. It should have been http://www.omegahat.org/RSXML/xmlToDataFrame.R as it is an R file. Thanks D.

Dieter Menne wrote: Duncan Temple Lang wrote: I wrote some relatively general, but hastily written, functions to read this sort of data. You can find them attached or at http://www.omegahat.org/RSXML/xmlToDataFrame.xml

Looks like that's the wrong link. I also did not find it mentioned in the parent page. Dieter
[R] Mixdist producing Na's and NaN's
When using mix() from the mixdist package, I sometimes get NA's and NaN's in the standard error estimates. I've understood from the code that mix() uses non-linear minimization (nlm) to estimate those errors, but I can't understand how to interpret these results. Does anyone have a (web) reference to suggest, or an interpretation? Thanks, Etienne
Re: [R] configure encoding by default
On 7/15/2009 8:22 AM, Угодай n/a wrote: I want to use Russian letters for my diagrams. I do it in this manner:

m <- "заголовок"
Encoding(m) <- "UTF-8"
plot(1, 1, main=m)

But it is not convenient. How do I configure R to use UTF-8 for all strings, so that it works without the Encoding function, as in plot(1, 1, main="заголовок")?

You need to set your locale, either at the system level (and R will use it) or with Sys.setlocale() within a session. The details of how to specify that you want UTF-8 characters vary by system; I'm not sure it's possible on Windows, for instance. See the R Installation and Administration manual chapter on internationalization. Duncan Murdoch
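A minimal sketch of the session-level approach described above (not from the original thread; the locale name is an assumption and varies by system, and it must be installed on the machine):

```r
# On a Unix-alike with a Russian UTF-8 locale available:
Sys.setlocale("LC_ALL", "ru_RU.UTF-8")
plot(1, 1, main = "заголовок")  # string literals are now handled in the UTF-8 locale
```

Sys.setlocale() returns "" with a warning if the requested locale is not available, so it is worth checking its return value.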
Re: [R] ggplot2: geom_errorbarh()
In ggplot2, you'll want to use size. Hadley

On Wed, Jul 15, 2009 at 12:36 PM, Benoit Boulinguiez benoit.boulingu...@ensc-rennes.fr wrote: Hi, I expect 'width' to set the width of the horizontal line of the whisker, as it does with errorbar(). Regards/Cordialement Benoit Boulinguiez

-----Original Message----- From: hadley wickham [mailto:h.wick...@gmail.com] Sent: Sunday, 12 July 2009 11:04 To: Benoit Boulinguiez Cc: r-help@r-project.org Subject: Re: [R] ggplot2: geom_errorbarh()

Hi Benoit, What do you expect width to do? You are already setting the left and right extents with xmin and xmax. Hadley

On Thu, Jul 9, 2009 at 10:37 AM, Benoit Boulinguiez benoit.boulingu...@ensc-rennes.fr wrote: Hi all, quick question: is the optional argument width effective in the geom_errorbarh() layer of ggplot? I can't get it to work on this graph http://www.4shared.com/file/116919103/93488d88/iso_2PrsH.html

pdf(file = "iso_2PrsH.pdf", width = 7, height = 7)
NC60.iso.graph <- ggplot(NC60.DATA, aes(Ce, Qe)) +
  geom_point(col = MaCouleur1, size = 4) +
  geom_errorbar(aes(ymax = NC60.DATA$Qe + NC60.DATA$sdQe,
                    ymin = NC60.DATA$Qe - NC60.DATA$sdQe),
                colour = alpha("black", 0.4), width = 1) +
  geom_errorbarh(aes(xmax = NC60.DATA$Ce + NC60.DATA$sdCe,
                     xmin = NC60.DATA$Ce - NC60.DATA$sdCe),
                 colour = alpha("black", 0.4), width = 1) +
  geom_line(data = NC60.Res4.curve, aes(x, y), size = 1, colour = alpha("black", 0.5)) +
  xlab(C[e]~(mmol/m^3)) + ylab(q[e]~(mmol/m^3))
print(NC60.iso.graph)
dev.off()

Regards/Cordialement - Benoit Boulinguiez Ph.D student Ecole de Chimie de Rennes (ENSCR) Bureau 1.20 Equipe CIP UMR CNRS 6226 Sciences Chimiques de Rennes Avenue du Général Leclerc CS 50837 35708 Rennes CEDEX 7 Tel 33 (0)2 23 23 80 83 Fax 33 (0)2 23 23 81 20 http://www.ensc-rennes.fr/
-- http://had.co.nz/
Re: [R] hi friends, is there any wait function in R
Hi, deepak m r deepak...@gmail.com wrote on 14.07.2009 16:07:16: Hi, I am not an expert at debugging R; can you please help? best regards deepak

See fortune("brain"). You do not need to debug R; I believe R-Core will do it for you (if you really find some bug in R, which seems highly improbable). Instead, see which plots are wrong and try to look at your data with any suitable functions provided by R itself (str(), ls(), head()). At first sight, from 1:length(paste("alp", s[t], "_mean", sep="")) I expect to get 1. What do ***you*** expect to get? Regards Petr

On Tue, Jul 14, 2009 at 10:06 AM, deepak m r deepak...@gmail.com wrote: Hi, On Tue, Jul 14, 2009 at 9:38 AM, Duncan Murdoch murd...@stats.uwo.ca wrote: On 7/14/2009 8:56 AM, deepak m r wrote: Hi, An empty plot is appearing and I don't know why. Can you please clarify how I can use the print function instead of the plot function?

You need print() if you are using grid-based graphics (lattice, ggplot2, ...) in a script. You are using classic graphics, so it should not be necessary. If you are getting blank plots when you shouldn't, that's a bug. If you can put together a reproducible example that shows it's a bug in R, rather than a bug in your script, it will likely be fixed fairly quickly. Duncan Murdoch

best regards deepak

On Tue, Jul 14, 2009 at 7:48 AM, Petr PIKAL petr.pi...@precheza.cz wrote: Hi, For this type of problem I do a multipage pdf:

pdf(file, ...)
for (i in ...) {
  # do all stuff including plot
}
dev.off()

and then check the plots afterwards. Recently there was some post about how to wait, but you do not want only to wait, you want also to interactively change plotting parameters, don't you?

cat("\n", "Enter x", "\n") # prompt
x <- scan(n=1) # read 1 line from console

This construction prints something on the console and reads one line from the console. There are also some packages which let you choose from several options. I think car and randomForest contain such routines, so you could check how it is done.
Regards Petr

r-help-boun...@r-project.org wrote on 14.07.2009 13:17:01: hi, is there any wait function in R? I am running one R script to plot many graphs in a for loop. It's showing no error but it's not plotting well. I think I can solve this problem with a wait function. Please help me in this regard. If you need any clarification about the programme, you can find the script below. best regards, Deepak.M.R Biocomputing Group University of Bologna.

#!/usr/bin/R
s <- c("GG","GA","GV","GL","GI","GM","GF","GW","GP","GS","GT","GC","GY","GN","GQ","GD","GE","GK","GR","GH","AA","AV","AL","AI","AM","AF","AW","AP","AS","AT","AC","AY","AN","AQ","AD","AE","AK","AR","AH","VV","VL","VI","VM","VF","VW","VP","VS","VT","VC","VY","VN","VQ","VD","VE","VK","VR","VH","LL","LI","LM","LF","LW","LP","LS","LT","LC","LY","LN","LQ","LD","LE","LK","LR","LH","II","IM","IF","IW","IP","IS","IT","IC","IY","IN","IQ","ID","IE","IK","IR","IH","MM","MF","MW","MP","MS","MT","MC","MY","MN","MQ","MD","ME","MK","MR","MH","FF","FW","FP","FS","FT","FC","FY","FN","FQ","FD","FE","FK","FR","FH","WW","WP","WS","WT","WC","WY","WN","WQ","WD","WE","WK","WR","WH","PP","PS","PT","PC","PY","PN","PQ","PD","PE","PK","PR","PH","SS","ST","SC","SY","SN","SQ","SD","SE","SK","SR","SH","TT","TC","TY","TN","TQ","TD","TE","TK","TR","TH","CC","CY","CN","CQ","CD","CE","CK","CR","CH","YY","YN","YQ","YD","YE","YK","YR","YH","NN","NQ","ND","NE","NK","NR","NH","QQ","QD","QE","QK","QR","QH","DD","DE","DK","DR","DH","EE","EK","ER","EH","KK","KR","KH","RR","RH","HH")
for (t in 1:length(s)) {
  a <- read.table(paste("../All_alpha_proteins/alp", s[t], "mean.sat", sep=""), header=TRUE)
  attach(a)
  names(a)
  al <- 1:length(paste("alp", s[t], "_mean", sep=""))
  b <- read.table(paste("../All_beta_proteins/bet", s[t], "mean.sat", sep=""), header=TRUE)
  attach(b)
  names(b)
  bl <- 1:length(paste("bet", s[t], "_mean", sep=""))
  p <- read.table(paste("../Alpha_and_beta_proteins_a+b/apb", s[t], "mean.sat", sep=""), header=TRUE)
  attach(p)
  names(p)
  pl <- 1:length(paste("apb", s[t], "_mean", sep=""))
  o <- read.table(paste("../Alpha_and_beta_proteins_aorb/aob", s[t], "mean.sat", sep=""), header=TRUE)
  attach(o)
  names(o)
  ol <- 1:length(paste("aob", s[t], "_mean", sep=""))
  postscript(file=paste("Mean_", s[t], "_Plot.ps", sep=""))
  plot(0, xlim=c(0,160), ylim=c(0,70), main=paste("Mean Distance Plot for ", s[t], " Residue pair", sep=""), ylab="Mean Distance in Angstrom", xlab="Residue Separation Number")
  lines(al, paste("alp", s[t], "_mean", sep=""), col="blue", lty = 2)
  lines(bl, paste("bet", s[t], "_mean", sep=""), col="yellow", lty = 2)
  lines(pl, paste("apb", s[t], "_mean", sep=""), col="red", lty = 2)
  lines(ol, paste("aob", s[t], "_mean", sep=""), col="green", lty = 2)
  legend("topleft", c("Alpha Proteins","Beta Proteins","Alpha + Beta","Alpha or Beta"), lty = c(2,2,2,2), col=c("blue","yellow","red","green"))
  dev.off()
}
Re: [R] nls, reach limit bounds
Hi Uyen, In addition to my suggestions in the previous email, you may also want to check whether the LL.4 function in the drc package has the same parametrization as the function that you are using. Different parametrizations of the same model can have dramatically different convergence behavior in nonlinear least squares problems. Ravi.

--- Ravi Varadhan, Ph.D. Assistant Professor, The Center on Aging and Health Division of Geriatric Medicine and Gerontology Johns Hopkins University Ph: (410) 502-2619 Fax: (410) 614-9625 Email: rvarad...@jhmi.edu Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty_personal_pages/Varadhan.html

-----Original Message----- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Ravi Varadhan Sent: Tuesday, July 14, 2009 7:35 PM To: 'UyenThao Nguyen'; 'spencerg' Cc: r-help@r-project.org Subject: Re: [R] nls, reach limit bounds

I took a quick look at the drc package and the drm function. The drm() function uses optim (BFGS method). So, that is one difference. However, without seeing your code showing how you used drm(), I cannot tell further. The fact that you got an answer using optim() does not necessarily mean that everything is swell. Did you check the Hessian to see if it is positive-definite? You might also want to try the nls.lm() function in the minpack.lm package. Ravi.

--- Ravi Varadhan, Ph.D. Assistant Professor, The Center on Aging and Health Division of Geriatric Medicine and Gerontology Johns Hopkins University Ph: (410) 502-2619 Fax: (410) 614-9625 Email: rvarad...@jhmi.edu Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty_personal_pages/Varadhan.html

-----Original Message----- From: UyenThao Nguyen [mailto:ungu...@tethysbio.com] Sent: Tuesday, July 14, 2009 7:07 PM To: Ravi Varadhan; 'spencerg' Cc: r-help@r-project.org Subject: RE: [R] nls, reach limit bounds

Hi Ravi and Spencer, Thank you very much for your help.
I did plot the data, and saw that the data didn't seem to have an inflection point. Yes, the data contained 6 points of duplicates, for which the 4-parameter logistic regression is appropriate. I tried the dose response model (drm in the drc package); this model works without any problem. Do you know if drm has a different convergence tolerance or something else? Let's say I choose drm to fit the data; can I get the parameters in the same way nls gives them to me? I tested a converged data set on both drm and nls, and I can't see any relationship between their parameters although the fits were similar. Thank you. Uyen

-----Original Message----- From: Ravi Varadhan [mailto:rvarad...@jhmi.edu] Sent: Monday, July 13, 2009 3:32 PM To: 'spencerg'; UyenThao Nguyen Cc: r-help@r-project.org Subject: RE: [R] nls, reach limit bounds

Hi Uyen, You do not have enough information to estimate 4 parameters in your nonlinear model. Even though you have 12 data points, only 6 are contributing independent information (you essentially have 6 replicate points). If you plot the fitted function you will see that it fits your data really well. However, you will also see that the fitted curve is far from reaching the asymptote. An implication of this is that you cannot reliably estimate b0 and b1 separately. So, you need more data, especially for larger x values in the asymptotic region. Ravi.

--- Ravi Varadhan, Ph.D. Assistant Professor, The Center on Aging and Health Division of Geriatric Medicine and Gerontology Johns Hopkins University Ph: (410) 502-2619 Fax: (410) 614-9625 Email: rvarad...@jhmi.edu Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty_personal_pages/Varadhan.html

-----Original Message----- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of spencerg Sent: Saturday, July 11, 2009 9:50 PM To: UyenThao Nguyen Cc: r-help@r-project.org Subject: Re: [R] nls, reach limit bounds

Have you plotted the data?
There are two standard sources of non-convergence problems like this: First, there may not be enough information in your data to estimate all four parameters. Second, if that's not the case, then your starting values are not appropriate. I routinely use optim or nlminb to find a sensible solution, because these general-purpose optimizers are more likely to converge than nls. To do this, I write a function with a name like SSElogistic to compute the sum of squares of residuals for your model.
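A minimal sketch of that approach (illustrative only, not from the thread; the function name, the particular 4-parameter logistic form, and the example data are assumptions):

```r
# Sum of squared residuals for a 4-parameter logistic:
# f(x) = b0 + (b1 - b0) / (1 + exp((xmid - x) / scal))
SSElogistic <- function(par, x, y) {
  pred <- par[1] + (par[2] - par[1]) / (1 + exp((par[3] - x) / par[4]))
  sum((y - pred)^2)
}

# made-up example data
set.seed(1)
x <- 1:12
y <- 2 + 8 / (1 + exp((6 - x) / 1.5)) + rnorm(12, sd = 0.1)

# minimize the SSE directly with a general-purpose optimizer
fit <- optim(c(0, 10, 5, 1), SSElogistic, x = x, y = y)
fit$par  # estimates of b0, b1, xmid, scal
```

Once optim or nlminb finds a sensible minimum, its solution can be passed to nls as starting values if standard nls output is wanted.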
[R] Question related to merging zoo objects and apply function
Hello everyone, Say I have defined a convenience function in my code like this:

func <- function(x, j){
  x[168+j] <- x[72+j] + x[144+j]
}

And now I apply it to some columns in a big zoo object like this:

for (m in 1:24){
  combined <- merge(combined, LA1sum=apply(combined, 1, func, j=m))
}

The output of this for-loop will be a zoo object with columns named LA1sum.1, LA1sum.2, ..., LA1sum.24. If I have a vector of names like this:

namesVec <- c("LA1sum", "LP1sum", "GC1sum", "LL1sum", "LN1sum", "SI1sum", "LX1sum", "CO1sum", "CL1sum", "QS1sum", "HO1sum", "NG1sum", "XB1sum", "C.1sum", "FC1sum", "LH1sum", "LC1sum", "S.1sum", "W.1sum", "KW1sum", "CC1sum", "KC1sum", "CT1sum", "SB1sum")

what I want is for the merge() at each m to name the new column after the corresponding element of namesVec, that is, for m=1 I have LA1sum=apply(..., j=1), for m=2 I have LP1sum=apply(..., j=2), etc. What is the way to do this? Thank you in advance for your help! Regards, Sergey
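One possible sketch (not part of the original message, and assuming the merge() call behaves as it does in the poster's own loop): keep the merge as-is and rename the column that was just added.

```r
# hypothetical variant of the poster's loop: rename after each merge
for (m in seq_along(namesVec)) {
  combined <- merge(combined, apply(combined, 1, func, j = m))
  colnames(combined)[ncol(combined)] <- namesVec[m]  # name the newly added column
}
```

The newly merged series is always the last column, so indexing colnames() with ncol() picks it out at each step.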
[R] DECLARING A PANEL VARIABLE???
Hi, I am working on panel data; my data are clustered/grouped by the variable yearctry. I am running the regression below, but I can't make the regression recognise yearctry as the panel variable:

myProbit <- glm(s ~ age + gender + gemedu + gemhinc + es_gdppc + imf_pop + estbbo_m, family = binomial(link = "probit"), data = adpopdata)

Can anyone help me do this please??? Thanks Saurav

-- Dr. Saurav Pathak PhD, Univ. of Florida, Mechanical Engineering Doctoral Student, Innovation and Entrepreneurship Imperial College Business School s.patha...@imperial.ac.uk 0044-7795321121
[R] Simple cat statement - output truncated
I have a statement:

cat("myforecast ETS(", paste(object$components[1], object$components[2], object$components[3], object$components[4], sep = ","), ") ", n, "\n")

that generates:

cast ETS( A,N,N,FALSE ) 3

Can anyone guess why the first 5 letters are truncated/missing? Kevin
Re: [R] Grouping data in dataframe
Another approach is to use the reshape package. Assuming your data frame is called xx:

library(reshape)
mm <- melt(xx, id = c("Size")); mm
cast(mm, Size ~ variable, median)

--- On Tue, 7/14/09, Timo Schneider timo.schnei...@s2004.tu-chemnitz.de wrote:

Subject: [R] Grouping data in dataframe
To: r-help@r-project.org
Received: Tuesday, July 14, 2009, 11:56 PM

Hello, I have a dataframe (obtained from read.table()) which looks like

  ExpA ExpB ExpC Size
1   12   23   33    1
2   12   24   29    1
3   10   22   34    1
4   25   50   60    2
5   24   53   62    2
6   21   49   61    2

Now I want to take all rows that have the same value in the Size column and apply a function to the columns of these rows (for example median()). The result should be a new dataframe with the medians of the groups, like this:

  ExpA ExpB ExpC Size
1   12   23   34    1
2   24   50   61    2

I tried to play with the functions by() and tapply() but I didn't get the results I wanted so far, so any help on this would be great!

The reason why I am having this problem (I explain this just to make sure I don't do something against the nature of R): I am doing 3 similar experiments, A, B and C, and I change a parameter in the experiment (size). Every experiment is done multiple times and I need the median or average over all experiments that are the same. Should I preprocess my data files so that they are completely different? Or is it fine the way it is and I just overlooked the simple solution to the problem described above?

Regards, Timo
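For completeness, the same grouped-median result can be had in base R without any extra packages, using aggregate() with a formula. A minimal sketch on a data frame shaped like the one in the question:

```r
# Hypothetical data frame shaped like the one in the question.
xx <- data.frame(ExpA = c(12, 12, 10, 25, 24, 21),
                 ExpB = c(23, 24, 22, 50, 53, 49),
                 ExpC = c(33, 29, 34, 60, 62, 61),
                 Size = c(1, 1, 1, 2, 2, 2))

# Median of each Exp column within each Size group.
res <- aggregate(cbind(ExpA, ExpB, ExpC) ~ Size, data = xx, FUN = median)
res
#   Size ExpA ExpB ExpC
# 1    1   12   23   33
# 2    2   24   50   61
```

aggregate() returns one row per group, with the grouping column first.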
Re: [R] Simple cat statement - output truncated
On 7/15/2009 9:53 AM, rkevinbur...@charter.net wrote:

  I have a statement:

  cat("myforecast ETS(", paste(object$components[1], object$components[2],
      object$components[3], object$components[4], sep = ","), ") ", n, "\n")

  That generates:

  cast ETS( A,N,N,FALSE ) 3

  Anyone have a guess as to why the first letters are truncated/missing?

You are probably being punished for posting non-reproducible code*. When I try a reproducible version of the line above, things look fine:

cat("myforecast ETS(", paste("A", "N", "N", "FALSE", sep = ","), ") ", 3, "\n")
myforecast ETS( A,N,N,FALSE )  3

Duncan Murdoch

* R has a new predictive punishment module. It punishes you for things it knows you will do later.
[R] Plotting hourly time-series data loaded from file using plot.ts
Hello everyone,

I am just a tyro in R and would like your kind help with some problems I've been struggling with for a while, but still in vain. I have a time-series file (with some missing values) which looks like

time[sec] , Factor1 , Factor2
00:00:00 01.01.2007 , 0. , 0.176083
01:00:00 01.01.2007 , 0. , 0.176417
[ ... ]
11:00:00 10.06.2007 , 0. , 0.148250
12:00:00 10.06.2007 , NA , 0.147000
13:00:00 10.06.2007 , NA , 0.144417
[ ... ]

and I would like to do some basic time-series analyses using R. The first idea is to plot these time series, and the main problem was the handling of the date/time format in the 1st column. I was using the script below to deal with it:

data <- read.table(file, header = TRUE, sep = ",",
                   colClasses = c("character", "numeric", "numeric"))
data$time.sec. <- as.POSIXct(data$time.sec., format = "%H:%M:%S %d.%m.%Y")
dataTs <- as.ts(data)
plot.ts(dataTs)

Then the plot showed up with 3 subplots in one plot. The 1st is a straight line with the x-axis being just the sequence of row numbers and the y-axis being wrong numbers, which is completely wrong. The 2nd and the 3rd are correct, but the x-axis is still wrong. Does anyone know how to plot Factor1 and Factor2 correctly against the time format of the 1st column? Or should I use some other package? Besides, how can I plot these two series (Factor1 and Factor2) in two separate plots?

Best regards, Keith
Re: [R] Matrix multiplication precision
Hi Douglas,

  And the big lesson, of course, is the first rule of numerical linear algebra: just because a formula is written in terms of the inverse of a matrix doesn't mean that is a good way to calculate the result; in fact, it is almost inevitably the worst way of calculating the result. You only calculate the inverse of a matrix after you have investigated all other possible avenues for calculating the result and found them to be fruitless.

As a relative noob to the nitty-gritty details of numerical linear algebra, could you elaborate on this a bit (even if it's just a link to a book/reference that explains these issues in more detail)? Where/what else should we be looking to do that would be better? Or are there really no general rules, and the answer just depends on the situation?

Thanks,
-steve

--
Steve Lianoglou
Graduate Student: Physiology, Biophysics and Systems Biology
Weill Medical College of Cornell University
Contact info: http://cbio.mskcc.org/~lianos/contact
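The rule of thumb here has a direct R illustration: for a linear system A x = b, calling solve(A, b) uses a matrix factorization, while solve(A) %*% b first forms the explicit inverse, doing more work and losing accuracy as A becomes ill-conditioned. A small sketch with made-up data:

```r
set.seed(42)
A <- crossprod(matrix(rnorm(25), 5))  # a random 5x5 symmetric positive-definite matrix
b <- rnorm(5)

x_inv    <- solve(A) %*% b  # via the explicit inverse: more work, less accurate
x_direct <- solve(A, b)     # via a factorization: the recommended route

all.equal(drop(x_inv), x_direct)  # agree here; differences grow with ill-conditioning
```

For least squares the same principle says to prefer lm() or qr.solve() over forming (X'X)^-1 explicitly.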
Re: [R] (simple) xml into data.frame and reverse
Hi Duncan, thanks for the advice and the link. I just realized there is a mistake in the code I posted and I currently don't have the time to correct it, so I cannot try your function (yet). But I will do so in the next few days and tell you if it doesn't work. Thanks again and best, Stefan

On Wed, Jul 15, 2009 at 2:37 PM, Duncan Temple Lang dun...@wald.ucdavis.edu wrote:

  Thanks Dieter. It should have been http://www.omegahat.org/RSXML/xmlToDataFrame.R as it is an R file. Thanks, D.

  Dieter Menne wrote:

    Duncan Temple Lang wrote: I wrote some relatively general, but hastily written, functions to read this sort of data. You can find them attached or at http://www.omegahat.org/RSXML/xmlToDataFrame.xml

    Looks like that's the wrong link. I also did not find it mentioned in the parent page. Dieter
[R] Change data frame column names
Hi R helpers,

I have a data frame and I want to change the column names to names I have held in another data frame, but I am having difficulty. All data frames are large so I can't input the names manually. Below is what I have tried so far:

df <- data.frame(a=1:5, b=2:6, d=3:7, e=4:8)
coltitles <- data.frame(v1="col number one", v2="col number two",
                        v3="col number three", v4="col number four")

## first attempt
names(df) <- coltitles
names(df)
[1] "1" "1" "1" "1"

### not what I wanted, as I want names(df) to return
[1] "col number one" "col number two" "col number three" "col number four"

## second attempt: trying to convert to a character vector after looking at the help
coltitles <- as.vector(coltitles, mode="character")
is.vector(coltitles)
[1] TRUE
names(df) <- coltitles
names(df)
[1] "1" "1" "1" "1"

### again not what I wanted

How can I convert the column names?

Thanks in advance, Tom
[R] Spaces in a name
I am reading regressors from an Excel file (I have no control over the file) and some of the element names have spaces, e.g. "Small Bank Aquired", but I have found that

lm(SourceData ~ . - Small Bank Aquired, mcReg)

doesn't work (mcReg = modelCurrentRegressors). As they are toggles I have run them through factor() to be treated properly as 0 or 1, but because I am automagically grabbing the first 2/3rds of the data, some of the regressors are either all 0s or all 1s, so I need to take them out of the model by hand for now, until I find a nice automatic method for removing regressors that have only 1 factor level.

So, primarily: how do I handle names that include spaces in this context? And as a bonus: anyone have a nice method for yanking regressors that have only a single factor level from the lm() formula? E.g., for the following 30 elements:

0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,
1,1,1,1,1,1,1,1,1,1

As you can see, grabbing the first 2/3rds gives all 0s and the last 1/3rd is all 1s. (I am doing in-sample forecast diagnostics: building the model only on the first 2/3rds of the data, then forecasting the next 1/3rd and comparing.) Sorry if I am rambling a bit, still on cup of coffee #1...
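Both parts of the question have standard idioms: backticks let a formula refer to a name containing spaces, and constant columns can be filtered out before fitting. A hedged sketch; the data frame, the x1 column and the AllSame column are made up for illustration:

```r
set.seed(1)
# check.names = FALSE preserves the space-containing name, as read.table/Excel
# imports often do.
dat <- data.frame(y  = rnorm(10),
                  x1 = rnorm(10),
                  `Small Bank Aquired` = factor(rep(c(0, 1), 5)),
                  AllSame = factor(rep(1, 10)),   # a degenerate one-level toggle
                  check.names = FALSE)

# 1) drop regressors with only one distinct value / factor level
keep <- vapply(dat, function(col) length(unique(col)) > 1, logical(1))
dat  <- dat[, keep, drop = FALSE]

# 2) backticks let the formula refer to a name containing spaces
fit <- lm(y ~ . - `Small Bank Aquired`, data = dat)
names(coef(fit))  # "(Intercept)" "x1"
```

An alternative to the backtick route is make.names() (or check.names = TRUE at import), which rewrites "Small Bank Aquired" to "Small.Bank.Aquired".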
Re: [R] Matrix multiplication precision
A good discussion of this is provided by Gill, Murray and Wright, Numerical Linear Algebra and Optimization, section 4.7.2.

url: www.econ.uiuc.edu/~roger
email: rkoen...@uiuc.edu
Roger Koenker, Department of Economics, University of Illinois, Urbana, IL 61801
vox: 217-333-4558, fax: 217-244-6678

On Jul 15, 2009, at 9:27 AM, Steve Lianoglou wrote:

  As a relative noob to the nitty-gritty details of numerical linear algebra, could you elaborate on this a bit (even if it's just a link to a book/reference that explains these issues in more detail)? [...]
Re: [R] predictive punishment module (was: Re: Simple cat statement - output truncated)
Duncan Murdoch murd...@stats.uwo.ca, 15/07/2009 15:04:29, wrote:

  * R has a new predictive punishment module. It punishes you for things it knows you will do later.

Dang! Does that mean it'll return errors for the statistical idiocy I haven't yet got around to committing? ;-)

Steve E
Re: [R] Change data frame column names
On 7/15/2009 10:35 AM, Tom Liptrot wrote:

  df <- data.frame(a=1:5, b=2:6, d=3:7, e=4:8)
  coltitles <- data.frame(v1="col number one", v2="col number two",
                          v3="col number three", v4="col number four")

It would be simpler to use

coltitles <- c(v1="col number one", v2="col number two",
               v3="col number three", v4="col number four")

(and the v1=, v2=, etc. are redundant). Then your first attempt,

  names(df) <- coltitles

would work. If you are taking titles from one data frame to put on another as in your original code, then use

names(df) <- names(coltitles)

Duncan Murdoch
Re: [R] The greatest common divisor between more than two integers
Thanks! I'll try that. Perhaps there is such a function in some package. Atte

On Wed, Jul 15, 2009 at 8:55 AM, Atte Tenkanen atte...@utu.fi wrote:

  Does somebody know if there is a function in R which computes the greatest common divisor of several (more than two) integers?

Is there a function for computing the greatest common divisor of *two* numbers? I can't find one, but assume that there is such a function, and call it gcd. Then you could define a recursive function to do the job. Something like

new_gcd <- function(v) {
  if (length(v) == 2) return(gcd(v[1], v[2]))
  else return(gcd(v[1], new_gcd(v[2:length(v)])))
}

where v is a vector containing the numbers you want to calculate the greatest common divisor of.

-- Michael Knudsen micknud...@gmail.com http://lifeofknudsen.blogspot.com/
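Since base R has no built-in gcd, here is a self-contained sketch: Euclid's algorithm for two numbers, folded over a vector with Reduce() so that any number of integers is handled without hand-written recursion over the vector:

```r
# Euclid's algorithm for two non-negative integers.
gcd2 <- function(a, b) if (b == 0) a else gcd2(b, a %% b)

# Fold the pairwise gcd over the whole vector:
# gcd(a, b, c) == gcd(gcd(a, b), c), so Reduce() does the recursion for us.
gcd <- function(v) Reduce(gcd2, v)

gcd(c(12, 18, 24))  # 6
gcd(c(7, 13))       # 1
```

The same Reduce() pattern works for least common multiples via lcm(a, b) = a * b / gcd2(a, b).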
Re: [R] Bug in package.skeleton, R 2.9.0?
Bump. Anyone? Thanks, Daniel

On 13 jul 2009, at 10.57, Daniel Klevebring wrote:

Dear all,

I am using package.skeleton to build a small package of misc functions for personal use. I have recently discovered that the option force=TRUE doesn't seem to do what it is meant to do. Here's what I'm doing:

setwd("/Users/danielk/Documents/R/packages/dk")
files <- paste(codebase, dir(codebase, pattern=".R"), sep="/")
package.skeleton(name="dk", force=TRUE, code_files=files)
Creating directories ...
Creating DESCRIPTION ...
Creating Read-and-delete-me ...
Copying code files ...
Making help files ...
Done.
Further steps are described in './dk/Read-and-delete-me'.

Now, everything seems fine, but changes to files in my codebase folder don't come along if the folder dk/R already contains the files, even though I use force=TRUE. If I remove the dk/R folder or the dk folder altogether, the changes come along, so to me it seems that it's the overwrite part that doesn't work as it should - or am I doing something wrong here? To me, it seems that the function safe.dir.create (which is defined in package.skeleton) never overwrites folders, rendering force=TRUE useless. See below for sessionInfo.

Thanks a bunch,
Daniel

sessionInfo()
R version 2.9.0 (2009-04-17)
i386-apple-darwin8.11.1
locale: en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] tools_2.9.0

--
Contact information:
Daniel Klevebring, M.Sc.Eng., Ph.D. Student
Dept of Gene Technology, Royal Institute of Technology, KTH
SE-106 91 Stockholm, Sweden
Visiting address: Roslagstullsbacken 21, B3
E-mail: dan...@biotech.kth.se
Phone: +46 8 5537 8337 (Office), +46 704 71 65 91 (Mobile)
Web: http://www.biotech.kth.se/genetech/index.html
Fax: +46 8 5537 8481
[R] Is it possible to use EGARCH and GJR in R?
Hi,

Could you please help me with EGARCH and GJR? Is it possible to use EGARCH and GJR in R? I have used the code below for GARCH in R, but I have never used EGARCH or GJR in R. Thank you in advance!

daten <- read.table("H://Daten//Zeitreihen//dax_1.csv", sep=";", header=T)
DAX.kurs <- daten
DAX.kurs <- ts(DAX.kurs, names="DAX-Kurs")
DAX.rendite <- diff(DAX.kurs)/DAX.kurs[1:length(DAX.kurs)-1]
g <- garchFit(data=DAX.rendite, formula=~garch(1,1), cond.dist="dstd")
g.predict <- predict(g, n.ahead=1)

Best regards, Andy
Re: [R] [R-sig-Geo] how to work with useFancyQuotes( ) to avoid \ quotes
Try this:

options(useFancyQuotes = FALSE)
print(paste("OPTIONS COORDSYS(", dQuote("Surface 1"), " AS COMPONENT); "), quote = FALSE)

HTH, Jaime

On Wed, 15 Jul 2009 13:15:32 +0200, Paulo E. Cardoso pecard...@netcabo.pt wrote:

  I need to get the exact sentence, with the quotes:

  OPTIONS COORDSYS("Surface 1" AS COMPONENT);

  but this:

  options(useFancyQuotes = FALSE)
  paste("OPTIONS COORDSYS(", dQuote("Surface 1"), " AS COMPONENT); ")

  gives me this:

  [1] "OPTIONS COORDSYS( \"Surface 1\" AS COMPONENT); "

  And it's not readable as SpatialSQL code by the GIS. How do I deal with this problem?

  Paulo E. Cardoso
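The key point, worth making explicit: the backslashes are not in the string at all, they are only how print() escapes the embedded quotes on screen. Writing the string out with cat() (or writeLines(), or write() to a file) emits the plain quotes the GIS expects. A minimal sketch:

```r
options(useFancyQuotes = FALSE)  # plain ASCII quotes from dQuote()
s <- paste0("OPTIONS COORDSYS(", dQuote("Surface 1"), " AS COMPONENT);")

nchar(s)       # the string contains real quote characters, no backslashes
cat(s, "\n")   # OPTIONS COORDSYS("Surface 1" AS COMPONENT);
```

When generating a script file for the GIS, writeLines(s, "out.sql") produces the same unescaped text.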
Re: [R] Nagelkerkes R2N
I am interested, Andrea, in whether you ever established why your R2 was 1. I have had a similar situation previously. My main issue, though, which I'd be very grateful for advice on, is why I am obtaining such negative values (-0.3) for Somers' Dxy using validate.cph from the Design package, given that my Nagelkerke R2 is not so low (13.2%). I get this output when fitting 6 variables, all with p-values < 0.01, and I am wondering what the interpretation should be. I know my Nagelkerke R2 isn't very good, but comparing my results with the example from ?validate.cph, although I have a better R2 (13% vs 9%), the Somers' Dxy from the example data set is much better, 38%, so certainly not negative! So my main question is: why such a difference between explained variation (R2) and predictive ability (Somers' Dxy)?

Obs 471, Events 228, Model L.R. 66.36, d.f. 6, P 0, Score 73.41, Score P 0, R2 0.132

validate(f, B=150, dxy=T)  # normally B=150

        index.orig       training          test      optimism  index.corrected    n
Dxy  -0.3022537331  -0.3135032097  -0.292492573  -0.021010636    -0.2812430968  150
R2    0.1319445685   0.1431179294   0.122599605   0.020518324     0.1114262446  150
Slope 1.00           1.00           0.923340558   0.076659442     0.9233405576  150
D     0.0250864459   0.0276820092   0.023163167   0.004518842     0.0205676038  150
U    -0.0007676033  -0.0007725071   0.000610456  -0.001382963     0.0006153598  150
Q     0.0258540493   0.0284545164   0.022552711   0.005901805     0.0199522440  150

I also calculated the Schemper and Henderson V measure and obtained V = 10.5%. I was using the surev package of Lusa L, Miceli R, Mariani L, "Estimation of predictive accuracy in survival analysis using R and S-PLUS", Computer Methods and Programs in Biomedicine 2007;87(2):132-7 (http://www.biomedexperts.com/Abstract.bme/17601627/Estimation_of_predictive_accuracy_in_survival_analysis_using_R_and_S-PLUS). And my code was

library(surev)
pred.accuracy <- f.surev(f)
pred.accuracy

Sorry if my question isn't clear - should I have included my sessionInfo for a methodological question? (I'm a newbie.) Many thanks for any advice.
[R] Problems with computing an aggregated score
Hi all,

I have a problem with computing a new variable as an aggregated score of other variables. Say I have 10 variables for 1,000 observations (people). Every variable may have values between 0 and 8. What I would like to do is compute the mean of the individual top 3 values for every person.

Example: the values of the 10 variables (v1 to v10) for persons A, B and C are as follows:

A 0 1 0 2 5 8 3 0 4 0
B 6 4 3 0 0 0 0 5 0 0
C 0 0 8 0 0 8 0 0 0 0

So I would like to compute a new variable for the mean of the individual top 3 values, which for persons A, B and C would result in the following:

A (8+5+4)/3 = 5.67
B (6+5+4)/3 = 5
C (8+8+0)/3 = 5.33

Is there any way to do this? Any clues, hints and suggestions are highly appreciated. Many thanks in advance, Johannes
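The "mean of the top 3" described above is a one-liner per row: sort each row in decreasing order, keep the first three values, average them. A direct sketch using the three example rows from the question:

```r
# The three example persons from the question as rows of a matrix.
m <- rbind(A = c(0, 1, 0, 2, 5, 8, 3, 0, 4, 0),
           B = c(6, 4, 3, 0, 0, 0, 0, 5, 0, 0),
           C = c(0, 0, 8, 0, 0, 8, 0, 0, 0, 0))

# For each row: sort decreasing, take the 3 largest, average.
top3mean <- apply(m, 1, function(x) mean(sort(x, decreasing = TRUE)[1:3]))
round(top3mean, 2)
#    A    B    C
# 5.67 5.00 5.33
```

For a data frame of 1,000 observations, the same apply() call works on as.matrix(df[, paste0("v", 1:10)]).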
[R] tkbutton and functions help
Hey there,

I'm quite lost in the R tcltk package and would love to get some help. My problem is this:

total_subsets <- 5  # example
subcounter <- 1
# while the maximum number of subsets is not reached, put buttons onto the tt widget
while (subcounter <= total_subsets) {
  subname <- paste("subset_", 1:total_subsets, sep = "")
  subnameasstring2 <- subname[subcounter]
  toString(subnameasstring2)
  subname.button <- paste(subnameasstring2, ".button", sep = "")
  subname.button <- tkbutton(tt, text = subnameasstring2,
                             command = function() iconfigure(vhight, vwidth,
                                 vanzheight, vanzwidth,
                                 total_number_iplots_onscreen, subcounter))
  tkgrid(subname.button)
  subcounter <- subcounter + 1
}

I create dynamic buttons, but when I press them the value of e.g. subcounter has been overwritten by the last value assigned: button subset_1.button gives me 6 when I press it, and that's obviously wrong because it should be 1. Has anybody an idea how I can solve that? Btw: the goal is to spread isets onto the screen (but a lot of them).
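This is the classic closure-capture problem: all the callbacks share the same subcounter variable, and by the time a button is pressed the loop has finished, so every callback sees the final value. The usual fix is a factory function that forces a private copy per button. A minimal sketch that demonstrates the mechanism without needing tcltk (make_command and the callback list are hypothetical names; in the real code the returned function would be passed as the command= argument of tkbutton):

```r
# Factory: each call captures its own copy of i in a fresh environment.
make_command <- function(i) {
  force(i)        # evaluate i now, before the loop moves on
  function() i    # stand-in for the real button callback
}

callbacks <- list()
for (subcounter in 1:5) {
  callbacks[[subcounter]] <- make_command(subcounter)
}

callbacks[[1]]()  # 1, not 5: each callback kept the value it was built with
callbacks[[5]]()  # 5
```

Without force() (or some other use of i inside the factory before it returns), lazy evaluation would defer reading i and the same late-binding surprise could reappear.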
[R] Time series fitting and residual computing
Dear R users,

I have a CO2 time series which I want to fit with a fourth-harmonic function and then compute residuals. I tried two options but am not sure which one is correct; can anyone kindly help correct me?

# I would like to fit a model with a fourth-harmonic function:
# m(t) = a + b*t + c*t^2
#        + c1*sin(omega1*t) + d1*cos(omega1*t) + c2*sin(omega2*t) + d2*cos(omega2*t)
#        + c3*sin(omega3*t) + d3*cos(omega3*t) + c4*sin(omega4*t) + d4*cos(omega4*t)
# In my data 'file' I have two variables, 'co2obs' and 'time',
# so the above function in R will look EITHER like:

testfit <- lm(co2obs ~ 1 + time + (time^2)
              + sin(2*pi*time) + cos(2*pi*time) + sin(4*pi*time) + cos(4*pi*time)
              + sin(6*pi*time) + cos(6*pi*time) + sin(8*pi*time) + cos(8*pi*time),
              data = file)
# residuals for the above fit
testfit$residuals

# OR like:

testfit <- lm(co2obs ~ 1 + time + I(time^2)
              + sin(2*pi*time) + cos(2*pi*time) + sin(4*pi*time) + cos(4*pi*time)
              + sin(6*pi*time) + cos(6*pi*time) + sin(8*pi*time) + cos(8*pi*time),
              data = file)
# residuals for the above fit
testfit$residuals

NOTE: the third term (c*t^2) is written differently in the two versions; the rest of the terms are the same. Kindly advise which one of the above is correct, or whether both are wrong.

Great thanks, regards, Yogesh
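The two versions are not equivalent, and a tiny simulated example shows why: inside a model formula, ^ is the crossing operator, so (time^2) collapses to just time, whereas I(time^2) protects the arithmetic meaning and gives a genuine quadratic term. The second version in the question is therefore the correct one. The data below are made up purely to show the difference:

```r
set.seed(1)
time <- seq(0, 4, length.out = 200)
y <- 1 + 2 * time + 0.5 * time^2 + rnorm(200, sd = 0.1)  # simulated quadratic trend

fit1 <- lm(y ~ time + (time^2))   # (time^2) collapses to time: no quadratic term
fit2 <- lm(y ~ time + I(time^2))  # I() protects ^, giving the intended term

length(coef(fit1))  # 2 -- intercept and time only
length(coef(fit2))  # 3 -- intercept, time, and I(time^2)
```

The residuals are then resid(fit2) (equivalently fit2$residuals), and poly(time, 2) is another common way to write the quadratic part.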
Re: [R] Change data frame column names
On 7/15/2009 10:35 AM, Tom Liptrot wrote:

  df <- data.frame(a=1:5, b=2:6, d=3:7, e=4:8)
  coltitles <- data.frame(v1="col number one", v2="col number two",
                          v3="col number three", v4="col number four")
  names(df) <- coltitles
  [...]

Not sure if my first reply went out; it had an error in it, because I misread what you were trying to do. You want to assign a character vector as names. You can set it up like that originally using

coltitles <- c("col number one", "col number two", "col number three", "col number four")

and then your first attempt will work. If you need to get the names out of a data frame, then use

names(x) <- coltitles[1,]

to select the first row (or some other row if you want) of the data frame to use as names.

Duncan Murdoch
Re: [R] problem with merging matrices
On Wednesday, July 15, 2009 8:28 AM, jurgen claesen wrote:

  ... I'm a relatively new user of R and I have a problem with merging a collection of matrices. All matrices in this collection have the same dimension (532 rows and 532 columns), but can differ in the row and column names. I'd like to merge these matrices in such a way that the resulting matrix only contains unique row and column names, and in case of a match between the row or column names the smallest value is retained. As an example says more:

  A1 <- matrix(c(1:9), ncol=3, byrow=TRUE)
  rownames(A1) <- colnames(A1) <- c("a","b","c")

    a b c
  a 1 2 3
  b 4 5 6
  c 7 8 9

  A2 <- matrix(c(7,1,3,10,2,7,3,1,8), ncol=3, byrow=TRUE)
  rownames(A2) <- colnames(A2) <- c("a","y","z")

     a y z
  a  7 1 3
  y 10 2 7
  z  3 1 8

  I want something like this to be returned:

     a  b  c  y  z
  a  1  2  3  1  3
  b  4  5  6 NA NA
  c  7  8  9 NA NA
  y 10 NA NA  5  6
  z  3 NA NA  8  9

Two questions: (a) how do you want to decide which of the elements gets dropped out - like the value 7 in A2[1,1] above - and (b) what goes in the bottom corner of the new matrix? I would have guessed I would see

  2 7
  1 8

in the corner. Was that a typo, or do you really want to overwrite the values of the submatrix that *is* in A1?

david

-- David Huffer, Ph.D., Senior Statistician, CSOSA/Washington, DC, david.huf...@csosa.gov
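Assuming the intended rule is "take the smaller value wherever both matrices have one" (the reading David questions above), one hedged sketch is to expand each matrix onto the union of row and column names and then take the element-wise minimum with pmin(..., na.rm = TRUE); the expand() helper is a name made up here:

```r
A1 <- matrix(1:9, ncol = 3, byrow = TRUE,
             dimnames = list(c("a", "b", "c"), c("a", "b", "c")))
A2 <- matrix(c(7, 1, 3, 10, 2, 7, 3, 1, 8), ncol = 3, byrow = TRUE,
             dimnames = list(c("a", "y", "z"), c("a", "y", "z")))

# Place a matrix into an NA-filled frame covering the union of names.
expand <- function(m, rn, cn) {
  out <- matrix(NA_real_, length(rn), length(cn), dimnames = list(rn, cn))
  out[rownames(m), colnames(m)] <- m
  out
}

rn <- union(rownames(A1), rownames(A2))
cn <- union(colnames(A1), colnames(A2))

# Element-wise minimum; na.rm = TRUE keeps the value present in only one matrix.
merged <- pmin(expand(A1, rn, cn), expand(A2, rn, cn), na.rm = TRUE)
merged["a", "a"]  # 1 -- the smaller of A1's 1 and A2's 7
```

Cells covered by neither matrix remain NA; with this rule the "y"/"z" block comes from A2 (so the corner holds A2's 2, 7, 1, 8), which differs from the example output in the question at exactly the cells David asks about.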
Re: [R] Plotting hourly time-series data loaded from file using plot.ts
Try the zoo package: Lines - time[sec] , Factor1 , Factor2 00:00:00 01.01.2007 , 0. , 0.176083 01:00:00 01.01.2007 , 0. , 0.176417 11:00:00 10.06.2007 , 0. , 0.148250 12:00:00 10.06.2007 , NA , 0.147000 13:00:00 10.06.2007 , NA , 0.144417 library(zoo) library(chron) z - read.zoo(textConnection(Lines), sep = ,, header = TRUE, format = %H:%M:%S %m.%d.%Y, tz = , strip.white = TRUE) plot(z) and read the three vignetttes (pdf documents) that come with it. On Wed, Jul 15, 2009 at 10:07 AM, Keithkigio...@gmail.com wrote: Hello everyone, I am just a tyro in R and would like your kindly help for some problems which I've been struggling for a while but still in vain. I have a time-series file (with some missing value ) which looks like time[sec] , Factor1 , Factor2 00:00:00 01.01.2007 , 0. , 0.176083 01:00:00 01.01.2007 , 0. , 0.176417 [ ... ] 11:00:00 10.06.2007 , 0. , 0.148250 12:00:00 10.06.2007 , NA , 0.147000 13:00:00 10.06.2007 , NA , 0.144417 [ ... ] and I would like to do some basic time-series analyses using R. The first idea is to plot these time-series events and the main problem was the handling of the date/time format in the 1st column. I was using the script below to deal with: data - read.table(file,header=TRUE,sep=,,colClasses=c(character,numeric,numeric)) data$time.sec. - as.POSIXct(data$time.sec.,format=%H:%M:%S %d.%m.%Y) dataTs - as.ts(data) plot.ts(dataTs) Then, the plot showed up with 3 subplots in one plot. The 1st is the linear line with the x-axis being just the sequence of orders and y-axis being wrong numbers which is completely wrong. The 2nd and the 3rd are correct but the x-axis is still wrong. Does anyone know how to plot correct Factor1 and Factor2 with respect to the 1st column time format? Or, probably should I use some other packages? Besides, how can I plot these two time-series data (Factor1 and Factor2) in two separate plots? 
Best regards, Keith
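On Keith's last question (separate plots for Factor1 and Factor2): with a multi-column zoo object, plot() already draws one panel per series by default, and the screens argument controls this. A minimal sketch with invented data, not Keith's file:

```r
library(zoo)

# Two hourly series with made-up values, standing in for Factor1/Factor2.
times <- as.POSIXct("2007-01-01 00:00:00", tz = "GMT") + 3600 * (0:4)
z <- zoo(cbind(Factor1 = c(0.1, 0.2, NA, 0.4, 0.5),
               Factor2 = c(0.176, 0.176, 0.148, 0.147, 0.144)),
         order.by = times)

pdf(NULL)            # null device so the sketch also runs non-interactively
plot(z)              # default: one panel per column, i.e. two separate plots
plot(z, screens = 1) # both series overlaid in a single panel
dev.off()
```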
Re: [R] Change data frame column names
On Wed, 2009-07-15 at 14:35 +0000, Tom Liptrot wrote: Hi R helpers, I have a data frame and I want to change the column names to names I have held in another data frame, but I am having difficulty. All data frames are large, so I can't input the names manually. Below is what I have tried so far:

df <- data.frame(a=1:5, b=2:6, d=3:7, e=4:8)
coltitles <- data.frame(v1="col number one", v2="col number two",
    v3="col number three", v4="col number four")

## first attempt
names(df) <- coltitles
names(df)
[1] "1" "1" "1" "1"
### not what I wanted, as I want names(df) to return
[1] "col number one" "col number two" "col number three" "col number four"

## second attempt: trying to convert to a character vector after looking at help
coltitles <- as.vector(coltitles, mode="character")
is.vector(coltitles)
[1] TRUE
names(df) <- coltitles
names(df)
[1] "1" "1" "1" "1"
### again not what I wanted

How can I convert the column names?

This seems a bit of a strange way to go about things, but if needs must:

names(df) <- unlist(coltitles)
df
  col number one col number two col number three col number four
1              1              2                3               4
2              2              3                4               5
3              3              4                5               6
4              4              5                6               7
5              5              6                7               8
names(df)
[1] "col number one" "col number two" "col number three" "col number four"

If your data frame holds the names in its one (and only) row (why store this as a one-row df and not a character vector??? [1]), then by unlisting it we get a character vector (or something that can be coerced to one) of the correct names. If coltitles contains more rows, then:

names(df) <- unlist(coltitles[1, ])

might be more appropriate.

HTH

G

[1] Your df storage of names is so inefficient (for this task):

object.size(names(df)) # after running the above code
312 bytes
object.size(coltitles)
2728 bytes

Thanks in advance, Tom
--
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
Dr. Gavin Simpson          [t] +44 (0)20 7679 0522
ECRC, UCL Geography,       [f] +44 (0)20 7679 0565
Pearson Building,          [e] gavin.simpsonATNOSPAMucl.ac.uk
Gower Street, London       [w] http://www.ucl.ac.uk/~ucfagls/
UK. WC1E 6BT.              [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
Re: [R] Problems with computing an aggregated score
On Wed, Jul 15, 2009 at 3:37 AM, Hatsch johannes.h...@googlemail.com wrote: Hi all, I have a problem with computing a new variable as an aggregated score of other variables. Say I have 10 variables for 1,000 observations (people). Every variable may have values between 0 and 8. What I would like to do is compute the mean of the individual top 3 values for every person. Example: the values for the 10 variables (v1 to v10) for persons A, B and C are as follows:

A 0 1 0 2 5 8 3 0 4 0
B 6 4 3 0 0 0 0 5 0 0
C 0 0 8 0 0 8 0 0 0 0

So I would like to compute a new variable for the mean of the individual top 3 values, which would result for persons A, B and C in the following:

A (8+5+4)/3 = 5.67
B (6+5+4)/3 = 5
C (8+8+0)/3 = 5.33

Is there any way to do this? Any clues, hints and suggestions are highly appreciated. Many thanks in advance, Johannes

Unquestionably the experienced guys are going to give you a better answer, but as a newbie I worked out a step-by-step answer in a few minutes. This lets you see the steps:

A <- c(0, 1, 0, 2, 5, 8, 3, 0, 4, 0)
A[order(as.numeric(A), decreasing=TRUE)]
A[order(as.numeric(A), decreasing=TRUE)][1:3]
sum(A[order(as.numeric(A), decreasing=TRUE)][1:3])
sum(A[order(as.numeric(A), decreasing=TRUE)][1:3])/3

Hope this helps,
Mark
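Mark's per-vector steps extend to all persons at once with apply over the rows. A sketch using the example values from the question:

```r
# One row of scores per person, as in Johannes's example.
scores <- rbind(A = c(0,1,0,2,5,8,3,0,4,0),
                B = c(6,4,3,0,0,0,0,5,0,0),
                C = c(0,0,8,0,0,8,0,0,0,0))

# Mean of the top 3 values in each row.
top3mean <- apply(scores, 1, function(v) mean(sort(v, decreasing = TRUE)[1:3]))
round(top3mean, 2)
#    A    B    C
# 5.67 5.00 5.33
```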
Re: [R] Change data frame column names
Try

names(df) <- as.vector(unlist(coltitles))

This will fail if you have more than one row in your data frame coltitles, which is why Duncan Murdoch's second email has a better solution. Given the difficulty of getting the titles out of the data frame in vector form, I would wonder why you're storing them as data in a data frame in the first place.

Note also that the default behavior when creating a data frame is for character data to be converted to factors. You can see this with

class(coltitles$v1)
[1] "factor"

That's part of what gave you weird results. So I would suggest looking at the help pages to figure out how to prevent conversion to factor.

If you want names(df) to return "col number one" etc., then you have to create a data frame whose column names are those. You created a data frame (coltitles) whose names were v1, v2, etc. For example:

tmp <- data.frame("col number one"=1, "col number two"="foobar",
    check.names=FALSE)
names(tmp)
[1] "col number one" "col number two"

But it is simpler and easier to just do as Duncan suggested and create a vector of names.

-Don

At 2:35 PM +0000 7/15/09, Tom Liptrot wrote: Hi R helpers, I have a data frame and I want to change the column names to names I have held in another data frame, but I am having difficulty. All data frames are large, so I can't input the names manually. Below is what I have tried so far:

df <- data.frame(a=1:5, b=2:6, d=3:7, e=4:8)
coltitles <- data.frame(v1="col number one", v2="col number two",
    v3="col number three", v4="col number four")

## first attempt
names(df) <- coltitles
names(df)
[1] "1" "1" "1" "1"
### not what I wanted, as I want names(df) to return
[1] "col number one" "col number two" "col number three" "col number four"

## second attempt: trying to convert to a character vector after looking at help
coltitles <- as.vector(coltitles, mode="character")
is.vector(coltitles)
[1] TRUE
names(df) <- coltitles
names(df)
[1] "1" "1" "1" "1"
### again not what I wanted

How can I convert the column names?
Thanks in advance, Tom

--
Don MacQueen
Environmental Protection Department
Lawrence Livermore National Laboratory
Livermore, CA, USA
925-423-1062
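Don's factor point can also be avoided up front: build coltitles with stringsAsFactors = FALSE (the default was TRUE before R 4.0) so the titles stay character. A small sketch:

```r
# Keep the titles as character from the start; then unlist() is enough.
coltitles <- data.frame(v1 = "col number one", v2 = "col number two",
                        v3 = "col number three", v4 = "col number four",
                        stringsAsFactors = FALSE)
df <- data.frame(a = 1:5, b = 2:6, d = 3:7, e = 4:8)
names(df) <- unlist(coltitles)
names(df)
# "col number one" "col number two" "col number three" "col number four"
```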
Re: [R] Question related to merging zoo objects and apply function
Please read the last line of every message to r-help. In particular, simplify this as much as possible and construct some small artificial test data to illustrate.

Anyway, func is probably not what you want. It has the same effect as

func <- function(x, j) x[72+j] + x[144+j]

On Wed, Jul 15, 2009 at 9:26 AM, Sergey Goriatchev serg...@gmail.com wrote: Hello everyone, say I have defined a convenience function in my code like this:

func <- function(x, j){
  x[168+j] <- x[72+j] + x[144+j]
}

And now I apply it to some columns in a big zoo object like this:

for (m in 1:24){
  combined <- merge(combined, LA1sum=apply(combined, 1, func, j=m))
}

The output of this for-loop will be a zoo object with columns named LA1sum.1, LA1sum.2, ..., LA1sum.24. If I have a vector of names like this:

namesVec <- c("LA1sum", "LP1sum", "GC1sum", "LL1sum", "LN1sum", "SI1sum",
    "LX1sum", "CO1sum", "CL1sum", "QS1sum", "HO1sum", "NG1sum", "XB1sum",
    "C.1sum", "FC1sum", "LH1sum", "LC1sum", "S.1sum", "W.1sum", "KW1sum",
    "CC1sum", "KC1sum", "CT1sum", "SB1sum")

what I want is for the name of the new column in merge() to be the m-th element of namesVec, that is:

for m=1 I have LA1sum=apply(..., j=1)
for m=2 I have LP1sum=apply(..., j=2)
etc.

What is the way to do this? Thank you in advance for your help! Regards, Sergey
Re: [R] Question related to merging zoo objects and apply function
Hello, Gabor. Generally I follow the r-help rules and provide working code snippets to illustrate the problem. In this case it was more a methodological question of how to loop over names in the merge() function. I solved this very simply by renaming the specific columns after running merge(). Thank you for your time! Best, Sergey

On Wed, Jul 15, 2009 at 17:18, Gabor Grothendieck ggrothendi...@gmail.com wrote: Please read the last line of every message to r-help. In particular, simplify this as much as possible and construct some small artificial test data to illustrate. Anyway, func is probably not what you want. It has the same effect as

func <- function(x, j) x[72+j] + x[144+j]

On Wed, Jul 15, 2009 at 9:26 AM, Sergey Goriatchev serg...@gmail.com wrote: Hello everyone, say I have defined a convenience function in my code like this:

func <- function(x, j){
  x[168+j] <- x[72+j] + x[144+j]
}

And now I apply it to some columns in a big zoo object like this:

for (m in 1:24){
  combined <- merge(combined, LA1sum=apply(combined, 1, func, j=m))
}

The output of this for-loop will be a zoo object with columns named LA1sum.1, LA1sum.2, ..., LA1sum.24. If I have a vector of names like this:

namesVec <- c("LA1sum", "LP1sum", "GC1sum", "LL1sum", "LN1sum", "SI1sum",
    "LX1sum", "CO1sum", "CL1sum", "QS1sum", "HO1sum", "NG1sum", "XB1sum",
    "C.1sum", "FC1sum", "LH1sum", "LC1sum", "S.1sum", "W.1sum", "KW1sum",
    "CC1sum", "KC1sum", "CT1sum", "SB1sum")

what I want is for the name of the new column in merge() to be the m-th element of namesVec, that is:

for m=1 I have LA1sum=apply(..., j=1)
for m=2 I have LP1sum=apply(..., j=2)
etc.

What is the way to do this? Thank you in advance for your help! Regards, Sergey

--
"I'm not young enough to know everything." /Oscar Wilde
"Experience is one thing you can't get for nothing."
/Oscar Wilde
"When you are finished changing, you're finished." /Benjamin Franklin
"Tell me and I forget, teach me and I remember, involve me and I learn." /Benjamin Franklin
"Luck is where preparation meets opportunity." /George Patten
Re: [R] Grouping data in dataframe
On Wednesday, 15.07.2009 at 00:42 -0500, markle...@verizon.net wrote: Hi!

Hi: I think aggregate does what you want. You had 34 in one of your columns, but I think you meant it to be 33.

DF <- read.table(textConnection("ExpA ExpB ExpC Size
1 12 23 33 1
2 12 24 29 1
3 10 22 34 1
4 25 50 60 2
5 24 53 62 2
6 21 49 61 2"), header=TRUE)
print(DF)
print(str(DF))
aggregate(DF, list(DF$Size), median)

Yes, thanks to you and all the other people who helped! The aggregate function is exactly what I was looking for. Thanks for the help! Regards, Timo
Re: [R] predictive punishment module (was: Re: Simple cat statem
On 15-Jul-09 14:45:26, S Ellison wrote: Duncan Murdoch murd...@stats.uwo.ca 15/07/2009 15:04:29: * R has a new predictive punishment module. It punishes you for things it knows you will do later.

Dang! Does that mean it'll return errors for the statistical idiocy I haven't yet got around to committing? ;-)

Steve E

It is even more twisted and sadistic than that. Being an Expert System on Human Psychology (how else could it succeed so well in putting the user on the wrong foot), it knows that the user will interpret the punishment as being related to the correct thing that the user just did, even though the punishment is for the thing that will be done later. Users will therefore develop an inhibition in respect of correct actions, and will be driven into incorrect behaviour. The incorrect thing that will be done later will be punished anyway. Since the complement of {an incorrect thing} is {correct thing} union {all the other incorrect things}, the user will end up

a) Confused
b) Increasingly likely to do incorrect things, therefore increasingly likely to be punished, finally being punished on almost all occasions of action (predictively or not).

So, now we know ... :(

Ted.

E-Mail: (Ted Harding) ted.hard...@manchester.ac.uk
Fax-to-email: +44 (0)870 094 0861
Date: 15-Jul-09 Time: 16:21:53
-- XFMail --
Re: [R] Problems with computing an aggregated score
Dear Johannes,

If 'x' is your data set, here is one way using apply:

res <- apply(x[, -1], 1, function(x){
  xo <- sort(x, decreasing = TRUE)
  mean(xo[1:3])
})
names(res) <- x[, 1]
res
#    A    B    C
# 5.67 5.00 5.33

See ?apply, ?sort and ?mean for more information.

HTH, Jorge

On Wed, Jul 15, 2009 at 6:37 AM, Hatsch johannes.h...@googlemail.com wrote: Hi all, I have a problem with computing a new variable as an aggregated score of other variables. Say I have 10 variables for 1,000 observations (people). Every variable may have values between 0 and 8. What I would like to do is compute the mean of the individual top 3 values for every person. Example: the values for the 10 variables (v1 to v10) for persons A, B and C are as follows:

A 0 1 0 2 5 8 3 0 4 0
B 6 4 3 0 0 0 0 5 0 0
C 0 0 8 0 0 8 0 0 0 0

So I would like to compute a new variable for the mean of the individual top 3 values, which would result for persons A, B and C in the following:

A (8+5+4)/3 = 5.67
B (6+5+4)/3 = 5
C (8+8+0)/3 = 5.33

Is there any way to do this? Any clues, hints and suggestions are highly appreciated. Many thanks in advance, Johannes
[R] negative Somers D from Design package
Dear R help,

My problem is very similar to the analysis detailed here. If we use the mayo dataset provided with the survivalROC package, the estimate for Somers' Dxy is very negative: -0.56. The Nagelkerke R2 is positive, though: 0.32. I know there is a difference between explained variation and predictive ability, but I am surprised there is such a difference, given that even a non-predictive model should have Dxy around 0. Am I doing something wrong, or is there an interpretation that makes sense? This is with the mayo data, so it's reproducible, but the result with my data is very similar. Many thanks in advance.

library(survivalROC)
library(Design)
library(survival)
data(mayo)
Sm <- Surv(mayo$time, mayo$censor)
fm <- cph(Sm ~ mayoscore4, mayo, x=TRUE, y=TRUE, surv=TRUE)
validate(fm, B=150, dxy=TRUE)

Iteration 1
        index.orig     training          test       optimism  index.corrected   n
Dxy   -0.566027923    -0.55407  -0.566027923  -0.0006374833     -0.565390440  150
R2     0.325860603  0.327350885  0.325860603   0.0014902826      0.324370320  150
Slope  1.0          1.0          0.987854765   0.0121452354      0.987854765  150
D      0.093398440  0.095166239  0.093398440   0.0017677983      0.091630642  150
U     -0.001562582 -0.001579618  0.001150175  -0.0027297932      0.001167211  150
Q      0.094961022  0.096745857  0.092248266   0.0044975915      0.090463431  150

Dr Bernard North
Statistical Consultant
Statistical Advisory Service
Advice and Courses on Research Design and Methodology
Imperial College
South Kensington Campus
Room 845, 4th Floor
8 Princes Gardens
London SW7 1NA
Tel: 020 7594 2034
Fax: 020 7594 1489
Email: bno...@imperial.ac.uk
Web: www.ic.ac.uk/stathelp
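One likely interpretation (my reading, not from the thread): the sign of Dxy only encodes the direction of the association between the predictor and survival time; discrimination is measured by |Dxy|. So -0.56 suggests a strongly predictive model in which higher mayoscore4 goes with shorter survival. A toy sketch with Hmisc's rcorr.cens and simulated data:

```r
library(survival)
library(Hmisc)

set.seed(1)
risk <- rnorm(200)
time <- rexp(200, rate = exp(risk))  # higher risk -> shorter survival times
S <- Surv(time, rep(1, 200))         # no censoring in this toy example

# Dxy comes out negative purely because of the direction of association.
dxy <- rcorr.cens(risk, S)["Dxy"]
dxy
```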
[R] For loop for distinguishing negative numbers
Hi, I am very new to R and I have been trying to change each individual piece of data in a data set to 10 if it is below 0 and 5 if it is above 0. I know this sounds very easy, but I am struggling!
Re: [R] nls, reach limit bounds
... and getting an answer is no assurance that the answer is meaningful. In cases like this (which arise frequently because of insistence on using the accepted mechanistic model paradigm even in the absence of informative data to estimate it), the confidence intervals for (correlated) parameters will usually be so wide as to be useless. That is, for practical purposes, the model is nonidentifiable. This can easily be seen by making small random perturbations in the data and watching how the parameter estimates change -- i.e. performing a sensitivity analysis. Incidentally, the predictions will typically be fine, so the standard scientific practice of graphing the data with an overlaid smooth curve as a check on the validity of the estimated parameters is nonsense. One should not get so lost among the trees of statistical niceties that one loses sight of the scientific forest: the model can't be adequately fit by the data. Either get more data or choose a more appropriate model.

Cheers,
Bert Gunter
Genentech Nonclinical Biostatistics

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Ravi Varadhan
Sent: Tuesday, July 14, 2009 4:35 PM
To: 'UyenThao Nguyen'; 'spencerg'
Cc: r-help@r-project.org
Subject: Re: [R] nls, reach limit bounds

I took a quick look at the drc package and the drm function. The drm() function uses optim (BFGS method). So that is one difference. However, without looking at your code on how you used drm(), I cannot tell further. The fact that you got an answer using optim() does not necessarily mean that everything is swell. Did you check the Hessian to see if it is positive definite? You might also want to try the nls.lm() function in the minpack.lm package.

Ravi.

---
Ravi Varadhan, Ph.D.
Assistant Professor, The Center on Aging and Health
Division of Geriatric Medicine and Gerontology
Johns Hopkins University
Ph: (410) 502-2619
Fax: (410) 614-9625
Email: rvarad...@jhmi.edu
Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty_personal_pages/Varadhan.html

-----Original Message-----
From: UyenThao Nguyen [mailto:ungu...@tethysbio.com]
Sent: Tuesday, July 14, 2009 7:07 PM
To: Ravi Varadhan; 'spencerg'
Cc: r-help@r-project.org
Subject: RE: [R] nls, reach limit bounds

Hi Ravi and Spencer, thank you very much for your help. I did plot the data, and saw that the data didn't seem to have an inflection point. Yes, the data contained 6 points of duplicates, for which the 4-parameter logistic regression is appropriate. I tried the dose-response model (drm in the drc package); this model works without any problem. Do you know if drm has a different tolerance in convergence, or something else? If I choose drm to fit the data, can I get the parameters in the same way nls gives them to me? I tested a converged data set on both drm and nls, and I can't see any relationship between their parameters, although the fits were similar. Thank you. Uyen

-----Original Message-----
From: Ravi Varadhan [mailto:rvarad...@jhmi.edu]
Sent: Monday, July 13, 2009 3:32 PM
To: 'spencerg'; UyenThao Nguyen
Cc: r-help@r-project.org
Subject: RE: [R] nls, reach limit bounds

Hi Uyen, you do not have enough information to estimate 4 parameters in your nonlinear model. Even though you have 12 data points, only 6 contribute independent information (you essentially have 6 replicate points). If you plot the fitted function you will see that it fits your data really well. However, you will also see that the fitted curve is far from reaching the asymptote. An implication of this is that you cannot reliably estimate b0 and b1 separately. So you need more data, especially for larger x values in the asymptotic region.

Ravi.

---
Ravi Varadhan, Ph.D.
Assistant Professor, The Center on Aging and Health
Division of Geriatric Medicine and Gerontology
Johns Hopkins University
Ph: (410) 502-2619
Fax: (410) 614-9625
Email: rvarad...@jhmi.edu
Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty_personal_pages/Varadhan.html

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of spencerg
Sent: Saturday, July 11, 2009 9:50 PM
To: UyenThao Nguyen
Cc: r-help@r-project.org
Subject: Re: [R] nls, reach limit bounds

Have you plotted the data? There are two standard sources of non-convergence problems like this: first, there may not be enough information in your data to estimate all four parameters. Second, if that's not the case, then your starting values are not appropriate. I routinely use optim or nlminb to find
Re: [R] For loop for distinguishing negative numbers
See ?ifelse. You didn't specify what happens if a value is exactly zero in the dataset, so I've just bundled it in with the negative case:

x <- rnorm(20, 0, 1)
y <- ifelse(x <= 0, 10, 5)

HTH,
Tony Breyal

cmga20 wrote: Hi, I am very new to R and I have been trying to change each individual piece of data in a data set to 10 if it is below 0 and 5 if it is above 0. I know this sounds very easy, but I am struggling!
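If the "data set" is a whole data frame rather than a single vector, the same ifelse recode can be applied column-wise. A sketch with invented data:

```r
set.seed(1)
d <- data.frame(a = rnorm(5), b = rnorm(5))

# Replace every value: 10 if <= 0, otherwise 5, keeping the data frame shape.
d[] <- lapply(d, function(col) ifelse(col <= 0, 10, 5))
d
```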
Re: [R] Simulation functions for underdispered Poisson and binomial distributions
On Wed, 15 Jul 2009, Shinichi Nakagawa wrote: Dear R users, I would like to simulate underdispersed Poisson and binomial distributions somehow. I know you can do this for the overdispersed counterparts, using rnbinom() for Poisson and rbetabinom() for binomial. Could anyone share functions to do this? Or please share some tips for modifying existing functions to achieve this.

Shinichi,

You really need a model for the underdispersion. Using that model, you would calculate the probabilities of the binomial or Poisson counts. But you have to come up with something appropriate for your situation. For example,

probs <- prop.table(dbinom(0:10, 10, .5)^3)

or

probs <- prop.table(dbinom(0:10, 10, .5) + ifelse(0:10 == 5, 1, 0))

will each produce probabilities for counts that are less dispersed than

probs <- dbinom(0:10, 10, 0.5)

but neither may suitably model the counts for the situation in which you are interested.

Once you have that model in hand,

sample(0:10, N, prob=probs, replace=TRUE)

will 'simulate' N such counts.

HTH,
Chuck

Thank you very much for your help and time.

Shinichi

Shinichi Nakagawa, PhD (Lecturer of Behavioural Ecology)
Department of Zoology, University of Otago
340 Great King Street, P.O. Box 56, Dunedin, New Zealand
Tel: +64-3-479-5046
Fax: +64-3-479-7584
http://www.otago.ac.nz/zoology/staff/academic/nakagawa.html

Charles C. Berry (858) 534-2098
Dept of Family/Preventive Medicine
E mailto:cbe...@tajo.ucsd.edu
UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/
La Jolla, San Diego 92093-0901
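A quick check (not in Chuck's message) that the sharpened probabilities really are underdispersed relative to the plain binomial:

```r
binom <- dbinom(0:10, 10, 0.5)
probs <- prop.table(binom^3)  # Chuck's first suggestion

# Variance of a distribution on the support 0:10.
distvar <- function(p) sum(p * (0:10)^2) - sum(p * (0:10))^2

distvar(probs) < distvar(binom)  # TRUE: less dispersed than Binomial(10, 0.5)
```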
Re: [R] Matrix multiplication precision
Thanks for the explanation. I am not too deep into linear algebra. T.mat, Rz.mat and Q.mat are square 4 x 4 matrices. To be more clear, here is what they look like:

h <- 3.39/2
# translation of the system
T.mat <- matrix(c(1,0,0,0, 0,1,0,0, 0,0,1,0, 0,0,-h,1), nrow=4)
alpha <- 36*pi/180
cos.alpha <- cos(alpha)
minus.sin.alpha <- -1*sin(alpha)
sin.alpha <- sin(alpha)
Rz.mat <- matrix(c(cos.alpha, minus.sin.alpha, 0, 0,
                   sin.alpha, cos.alpha, 0, 0,
                   0, 0, 1, 0,
                   0, 0, 0, 1), nrow=4)
Rx.mat <- matrix(c(1, 0, 0, 0,
                   0, cos.alpha, minus.sin.alpha, 0,
                   0, sin.alpha, cos.alpha, 0,
                   0, 0, 0, 1), nrow=4)

Q.mat is a product of Rz.mat and Rx.mat with different angles. The resultant matrix Mn.mat is used to compute the next coordinates of a helix:

center.Of.bases <- matrix(c(0.0, 0.0, 0.0, 1), nrow=4)
# This function computes the Mn.mat matrix using specific angles,
# which is used to get the new coordinates.
Mn.mat <- qr.solve(compute.Mn.mat(T.mat, Rz.mat, Q.mat))
New.center.Of.bases <- as.numeric(Mn.mat %*% center.Of.bases)

Do these lines of code tell you anything about the complexity of the problem? Let me know if I should do anything differently. I was really glad to hear from you since you are an expert in the area. Cheers, /Murli

-----Original Message-----
From: dmba...@gmail.com [mailto:dmba...@gmail.com] On Behalf Of Douglas Bates
Sent: Wednesday, July 15, 2009 7:29 AM
To: Nair, Murlidharan T
Cc: r-help@r-project.org
Subject: Re: [R] Matrix multiplication precision

On Wed, Jul 15, 2009 at 3:42 AM, Nair, Murlidharan T mn...@iusb.edu wrote: Hi!! I am trying to multiply 5 matrices and then using the inverse of that matrix for further computation. I read about precision problems from the archives, and the suggestion was to use as.numeric while computing the products.

Hmm. I'm not sure what the origins of that advice might be, but I don't think it would apply in general.

I am still having problems with the results.
Here is how I am using it:

# Mn.mat <- (T.mat %*% Rz.mat %*% Q.mat %*% Rz.mat %*% T.mat)
# I was doing this in one step, which I have broken down into multiple steps as below.
Mn1.mat <- matrix(as.numeric(T.mat %*% Rz.mat), nrow=4)
Mn2.mat <- matrix(as.numeric(Mn1.mat %*% Q.mat), nrow=4)
Mn3.mat <- matrix(as.numeric(Mn2.mat %*% Rz.mat), nrow=4)
Mn.mat <- matrix(as.numeric(Mn3.mat %*% T.mat), nrow=4)

I doubt very much that the as.numeric would have any effect on precision in these cases. You are simply taking a numeric result in its matrix form, converting it to a vector, then converting the vector back to a matrix.

mm <- matrix(round(rnorm(8), 3), nrow = 4)
mm
       [,1]  [,2]
[1,] -0.110 2.195
[2,]  0.616 0.353
[3,]  0.589 0.970
[4,]  1.028 0.857
as.numeric(mm)
[1] -0.110 0.616 0.589 1.028 2.195 0.353 0.970 0.857

The key in situations like this is to analyze the structure of the computation and decide whether you can group the intermediate results. You have written your product as

T %*% Rz %*% Q %*% Rz %*% T

What are the characteristics of T, Rz, and Q? Are they square and symmetric, square and triangular, diagonal? Is Q orthogonal (the factors in an orthogonal-triangular decomposition are often written as Q and R)? Did you happen to skip a few transposes in that formula? Often formulas like that generate symmetric matrices.

And the big lesson, of course, is the first rule of numerical linear algebra: just because a formula is written in terms of the inverse of a matrix doesn't mean that is a good way to calculate the result; in fact, it is almost inevitably the worst way of calculating the result. You only calculate the inverse of a matrix after you have investigated all other possible avenues for calculating the result and found them to be fruitless.
You haven't told us what you plan to do with the inverse, and that is an important consideration. If, for example, it represents a variance-covariance matrix and you want to express the result as standard deviations and correlations, you would not compute the variance-covariance matrix but stop instead at the Cholesky factor of the inverse.

You may want to check the facilities of the Matrix package for expressing your calculation. It is a recommended package that is included with binary versions of R from 2.9.0 and has many ways of expressing and exploiting structure in matrices.

For getting the inverse I am doing the following:

Mn.mat.inv <- qr.solve(Mn.mat)

Will I run into precision problems when I do the above? Thanks, /Murli
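Bates's "first rule" can be illustrated on a rotation matrix like Murli's Rz (a standalone sketch; compute.Mn.mat itself is never shown in the thread):

```r
alpha <- 36 * pi / 180
Rz <- matrix(c(cos(alpha), -sin(alpha), 0, 0,
               sin(alpha),  cos(alpha), 0, 0,
                        0,           0, 1, 0,
                        0,           0, 0, 1), nrow = 4, byrow = TRUE)
p <- c(1, 0, 0, 1)

x1 <- solve(Rz) %*% p  # forms the explicit inverse, then multiplies
x2 <- solve(Rz, p)     # solves the linear system directly: cheaper, and
                       # generally more accurate on ill-conditioned matrices
```

Here the two agree to machine precision, but the second form is the one Bates is recommending.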
Re: [R] (newbie) sum for certain number of rows
Thanks. I tried it and it worked wonderfully. Wishing for the 'DAY' to come. Life needs to be 'reset'.

--- On Wed, 7/15/09, Dimitris Rizopoulos d.rizopou...@erasmusmc.nl wrote: One way is the following:

dat <- read.table(textConnection("
0 0 1 0 0 1 0 1 0 0 0 0
0 0 0 0 1 0 0 1 1 0 1 0
0 0 1 1 0 0 0 0 1 1 0 0
0 0 1 1 0 0 0 0 0 0 0 0
0 1 0 1 1 0 1 0 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1
"))
closeAllConnections()

k <- 3
ind <- rep(seq(1, nrow(dat)/k), each = k)
rowsum(dat, ind)

I hope it helps. Best, Dimitris

kelvin lau wrote: I have the following data in a data.csv file, separated by spaces:

0 0 1 0 0 1 0 1 0 0 0 0
0 0 0 0 1 0 0 1 1 0 1 0
0 0 1 1 0 0 0 0 1 1 0 0
0 0 1 1 0 0 0 0 0 0 0 0
0 1 0 1 1 0 1 0 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1
etc...

I wish to calculate the sum of each column for a certain number of rows. For example, if I want the sum of the data after every 3 rows, it should display

1 0 1 1 1 1 1 1 1 1 1 1
0 0 1 1 2 3 2 3 3 2 3 2

So far, this is what I have done:

xx <- read.table("data.csv", header=FALSE)
ss <- t(apply(xx, 2, sum)) # which displayed the sum of all rows

I tried my best to look for a solution on the Internet but so far haven't managed to find it. I would be extremely grateful if someone could point me to how to go about it. Thanks, Kelvin
-- Dimitris Rizopoulos Assistant Professor Department of Biostatistics Erasmus University Medical Center Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands Tel: +31/(0)10/7043478 Fax: +31/(0)10/7043014 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
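Dimitris's rowsum approach above, as a self-contained runnable sketch (toy data invented here, with 4 columns instead of 12):

```r
# Sum each column over consecutive blocks of k rows.
dat <- as.data.frame(matrix(c(0, 0, 1, 0,
                              0, 1, 0, 1,
                              1, 0, 1, 1,
                              0, 0, 0, 0,
                              1, 1, 0, 0,
                              1, 1, 1, 1), ncol = 4, byrow = TRUE))
k <- 3
ind <- rep(seq_len(nrow(dat) / k), each = k)  # 1 1 1 2 2 2
sums <- rowsum(dat, ind)                      # one row per block of 3 rows
sums
```

rowsum() groups rows by the `ind` labels and sums within each group, which is exactly "sum of each column for every k rows" without any loop.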
Re: [R] read.delim skips first column (why?)
This should work: junk <- read.table("fd0edfab.txt", sep = "\t", header = TRUE, fill = FALSE, quote = "") HTH Paolo Sonego __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
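A common reason read.delim appears to "skip" the first column: when the header row has one fewer field than the data rows, R silently uses the first data column as row names. A minimal illustration (the file contents here are invented, not from the thread):

```r
# Header has 2 fields but each data row has 3:
# read.delim then treats the first column as row names.
txt <- "x\ty\na\t1\t2\nb\t3\t4\n"
d <- read.delim(text = txt)

ncol(d)      # 2, not 3
rownames(d)  # the "missing" first column
```

Supplying `row.names = NULL` (or fixing the header) keeps that column as data.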
[R] Passing additional arguments through '...'
Hi, I know this is a simple question, but I've been having problems passing additional arguments through '...'. It is not matching the arguments correctly if the formal argument of the function begins with the same letter as the additional argument. The following example will help show what I mean: fun.tester <- function(abc, ...){ + print(abc) + } But if I input: fun.tester(0, a=1) It returns the value '1' for abc. It does, however, work properly if I input: fun.tester(abc=0, a=1) It seems like a simple problem, so I would assume I'm doing something stupid, but I haven't been able to find a solution anywhere. Thanks! -- View this message in context: http://www.nabble.com/Passing-additional-arguments-through-%27...%27-tp24501159p24501159.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] FW: problems in resampling time series, block length, trend.test
Hi, I have a time series (say x) of 13 years showing an evident increase. I want to exclude two observations (the fourth and 10th), so I do: trend.test(x[-c(4,10)]) where: x[-c(4,10)] [1] 7 37 79 72 197 385 636 705 700 1500 1900 and I get: Spearman's rank correlation rho data: x[-c(4, 10)] and time(x[-c(4, 10)]) S = 4, p-value < 2.2e-16 alternative hypothesis: true rho is not equal to 0 sample estimates: rho 0.9818182 Very strong positive correlation! The x is increasing. Now, I would like to resample this time series because I have other time series where the trend is more uncertain and the sample size is still small. I read Resampling Methods in R: The boot Package (ISSN 1609-3631) and I believe that a way of doing it is by a block bootstrap, which allows one to deal with the autocorrelation (I know that some of my time series show autocorrelation, lag=1). I used the trend.test function (library pastecs) and I did: trend.x <- trend.test(x[-c(4,10)], R=999) trend.x trend.x$p.value plot(trend.x) And I get: trend.x <- trend.test(x[-c(4,10)], R=999) trend.x BLOCK BOOTSTRAP FOR TIME SERIES Fixed Block Length of 1 Call: tsboot(tseries = x, statistic = test.trend, R = R, l = 1, sim = "fixed") Bootstrap Statistics : original bias std. error t1* 0.9818182 -0.9844001 0.3272114 trend.x$p.value [1] 0 I suppose that the problem arises from the length of the block (1), and in this way I get a rho of 0 (nevertheless I don't understand how it is significant). I would like to change the block length but I am not able to (I tried several different ways, unsuccessfully). So, two questions: 1) How can I change the block length? 2) In terms of block length and type of simulation (sim="fixed" or "geom"), what is the best way of doing it? Thanks in advance for any suggestion, Best wishes Simone
_ [[elided Hotmail spam]] [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
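To control the block length directly, one option is to call boot::tsboot with the Spearman statistic yourself rather than going through trend.test (which fixes l = 1). A sketch; the statistic function here is an assumption about what trend.test computes, not taken from its source:

```r
library(boot)  # recommended package, normally installed with R

x <- c(7, 37, 79, 72, 197, 385, 636, 705, 700, 1500, 1900)

# Spearman rank correlation of the series against time.
rho_stat <- function(ts) cor(ts, seq_along(ts), method = "spearman")

set.seed(1)
# l sets the (fixed) block length; sim = "geom" would draw
# geometrically distributed block lengths with mean l instead.
b <- tsboot(x, rho_stat, R = 199, l = 3, sim = "fixed")
b$t0     # observed rho, about 0.98
```

With so short a series (n = 11), any block length leaves few effective blocks, so the bootstrap distribution should be interpreted cautiously.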
Re: [R] Bug in package.skeleton, R 2.9.0?
Daniel Klevebring wrote: Bump Anyone? Thanks Daniel On 13 jul 2009, at 10.57, Daniel Klevebring wrote: Dear all, I am using package.skeleton to build a small package of misc functions for personal use. I have recently discovered that the option force=TRUE doesn't seem to do what it is meant to do. Here's what I'm doing: setwd("/Users/danielk/Documents/R/packages/dk") files <- paste(codebase, dir(codebase, pattern=".R"), sep="/") package.skeleton(name="dk", force=TRUE, code_files=files) Creating directories ... Creating DESCRIPTION ... Creating Read-and-delete-me ... Copying code files ... Making help files ... Done. Further steps are described in './dk/Read-and-delete-me'. Now, everything seems fine, but changes to files in my codebase folder don't come along if the folder dk/R already contains the files, even though I use force=TRUE. If I remove the dk/R folder or the dk folder altogether, the changes come along, so to me it seems that it's the overwrite part that doesn't work as it should - or am I doing something wrong here? To me, it seems that the function safe.dir.create (which is defined in package.skeleton) never overwrites folders, rendering force=TRUE useless. From the help on package.skeleton: force: If 'FALSE' will not overwrite an existing directory. which could be clearer in that package.skeleton will stop with an error if the top level directory, i.e. 'dk' in your case, exists and you had specified force=FALSE. But it will not overwrite the existing directory 'dk' with force=TRUE, as you observed. Looking at the code it is clear that file.copy is used to copy the code files, and that has per default overwrite=FALSE, so your existing code files will not be overwritten. With the current implementation package.skeleton will not do what you intended. There are many ways to work around this, e.g.
fn <- dir(codebase, pattern=".R", full.names=TRUE) file.remove(list.files(path=file.path("dk", "R"), pattern="\\.R", full.names=TRUE)) package.skeleton(name="dk", force=TRUE, code_files=fn) Matthias See below for sessionInfo. Thanks a bunch Daniel sessionInfo() R version 2.9.0 (2009-04-17) i386-apple-darwin8.11.1 locale: en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8 attached base packages: [1] stats graphics grDevices utils datasets methods base loaded via a namespace (and not attached): [1] tools_2.9.0 -- Contact information: Daniel Klevebring M. Sc. Eng., Ph.D. Student Dept of Gene Technology Royal Institute of Technology, KTH SE-106 91 Stockholm, Sweden Visiting address: Roslagstullsbacken 21, B3 Delivery address: Roslagsvägen 30B, 104 06, Stockholm Invoice address: KTH Fakturaservice, Ref DAKL KTHBIO, Box 24075, SE-10450 Stockholm E-mail: dan...@biotech.kth.se Phone: +46 8 5537 8337 (Office) Phone: +46 704 71 65 91 (Mobile) Web: http://www.biotech.kth.se/genetech/index.html Fax: +46 8 5537 8481 MSN messenger: klevebr...@msn.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Matthias Burger Project Manager/ Biostatistician Epigenomics AG Kleine Praesidentenstr. 1 10178 Berlin, Germany phone: +49-30-24345-0 fax: +49-30-24345-555 http://www.epigenomics.com matthias.bur...@epigenomics.com -- Epigenomics AG Berlin Amtsgericht Charlottenburg HRB 75861 Vorstand: Geert Nygaard (CEO/Vorsitzender) Oliver Schacht
Re: [R] Passing additional arguments through '...'
Please consult the R Language Definition for a detailed explanation, but... In brief, the evaluator first tries to match formal arguments by name, first exactly, then partially, before matching by position, so a partially matches the formal argument abc. e.g. contrast fun.tester(0, b=1) ## b does not partially match abc [1] 0 Bert Gunter Genentech Nonclinical Biostatistics -----Original Message----- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of escher2079 Sent: Wednesday, July 15, 2009 9:06 AM To: r-help@r-project.org Subject: [R] Passing additional arguments through '...' Hi, I know this is a simple question, but I've been having problems passing additional arguments through '...'. It is not matching the arguments correctly if the formal argument of the function begins with the same letter as the additional argument. The following example will help show what I mean: fun.tester <- function(abc, ...){ + print(abc) + } But if I input: fun.tester(0, a=1) It returns the value '1' for abc. It does, however, work properly if I input: fun.tester(abc=0, a=1) It seems like a simple problem, so I would assume I'm doing something stupid, but I haven't been able to find a solution anywhere. Thanks! -- View this message in context: http://www.nabble.com/Passing-additional-arguments-through-%27...%27-tp24501159p24501159.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Is it possible to use EGARCH and GJR in R?
Hello, On 7/15/09, Andriy Fetsun fet...@googlemail.com wrote: Could you please help me with EGARCH and GJR? Do you mean JGR [1]? If so, install it and try running your code in it. Unless you bump into very specific issues, it should work. Liviu [1] http://cran.r-project.org/web/packages/JGR/index.html __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Change data frame column names
Thanks all, I used Gavin's approach - unlisting the titles and then replacing the names - as my titles were stored as factors in the data frame, since that was the way they were imported... Tom Subject: Re: [R] Change data frame column names From: gavin.simp...@ucl.ac.uk To: tomlipt...@hotmail.com CC: r-help@r-project.org Date: Wed, 15 Jul 2009 16:12:32 +0100 On Wed, 2009-07-15 at 14:35 +0000, Tom Liptrot wrote: Hi R helpers, I have a data frame and I want to change the column names to names I have held in another data frame, but I am having difficulty. All data frames are large so I can't input manually. Below is what I have tried so far: df <- data.frame(a=1:5, b=2:6, d=3:7, e=4:8) coltitles <- data.frame(v1="col number one", v2="col number two", v3="col number three", v4="col number four") ##first attempt names(df) <- coltitles names(df) [1] "1" "1" "1" "1" ###not what I wanted, as I want names(df) to return [1] "col number one" "col number two" "col number three" "col number four" ##second attempt coltitles <- as.vector(coltitles, mode="character") ##trying to convert to a character vector after looking at help is.vector(coltitles) [1] TRUE names(df) <- coltitles names(df) [1] "1" "1" "1" "1" ###again not what I wanted How can I convert the column names? This seems a bit of a strange way to go about things, but if needs must: names(df) <- unlist(coltitles) df col number one col number two col number three col number four 1 1 2 3 4 2 2 3 4 5 3 3 4 5 6 4 4 5 6 7 5 5 6 7 8 names(df) [1] "col number one" "col number two" "col number three" "col number four" If your data frame holds the names in its one (and only) row (why store this as a one-row df and not a character vector??? [1]), then by unlisting it we get a character vector (or something that can be coerced to one) of the correct names. If coltitles contains more rows, then: names(df) <- unlist(coltitles[1,]) might be more appropriate.
HTH G [1] Your df storage of names is so inefficient (for this task): object.size(names(df)) # after running the above code 312 bytes object.size(coltitles) 2728 bytes Thanks in advance, Tom [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
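Gavin's unlist approach from the thread above, as a self-contained sketch:

```r
df <- data.frame(a = 1:5, b = 2:6, d = 3:7, e = 4:8)
# stringsAsFactors = TRUE mirrors the pre-R-4.0 default in the thread;
# the approach works for character columns too.
coltitles <- data.frame(v1 = "col number one", v2 = "col number two",
                        v3 = "col number three", v4 = "col number four",
                        stringsAsFactors = TRUE)

# Assigning the data frame itself fails because each column is a
# one-level factor; unlist() flattens it to a vector that names<-
# can coerce to character.
names(df) <- unlist(coltitles)
names(df)
```

Storing the titles in a plain character vector from the start avoids the coercion entirely.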
Re: [R] negative Somers D from Design package
For the Cox model Dxy is the rank correlation between the predicted log hazard and the time to event. As a high hazard means that the time to event is short, you need to negate Dxy for your purpose. Frank North, Bernard V wrote: Dear R help My problem is very similar to the analysis detailed here. If we use the mayo dataset provided with the survivalROC package, the estimate for Somers' Dxy is very negative, -0.56. The Nagelkerke R2 is positive though, 0.32. I know there is a difference between explained variation and predictive ability, but I am surprised there is such a difference given that even a non-predictive model should have Dxy around 0. Am I doing something wrong or is there an interpretation that makes sense? This is with the mayo data so it's reproducible, but the result with my data is very similar. Many thanks in advance library(survivalROC) library(Design) library(survival) data(mayo) Sm <- Surv(mayo$time, mayo$censor) fm <- cph(Sm ~ mayoscore4, mayo, x=T, y=T, surv=T) validate(fm, B=150, dxy=T) Iteration 1 index.orig training test optimism index.corrected n Dxy -0.566027923 -0.55407 -0.566027923 -0.0006374833 -0.565390440 150 R2 0.325860603 0.327350885 0.325860603 0.0014902826 0.324370320 150 Slope 1.0 1.0 0.987854765 0.0121452354 0.987854765 150 D 0.093398440 0.095166239 0.093398440 0.0017677983 0.091630642 150 U -0.001562582 -0.001579618 0.001150175 -0.0027297932 0.001167211 150 Q 0.094961022 0.096745857 0.092248266 0.0044975915 0.090463431 150 Dr Bernard North Statistical Consultant Statistical Advisory Service Advice and Courses on Research Design and Methodology Imperial College South Kensington Campus Room 845, 4th Floor 8 Princes Gardens London SW7 1NA.
Tel: 020 7594 2034 Fax: 020 7594 1489 Email: bno...@imperial.ac.uk Web: www.ic.ac.uk/stathelp [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Frank E Harrell Jr Professor and Chair School of Medicine Department of Biostatistics Vanderbilt University __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Passing additional arguments through '...'
On Wed, 15 Jul 2009, escher2079 wrote: Hi, I know this is a simple question, but I've been having problems passing additional arguments through '...'. It is not matching the arguments correctly if the formal argument of the function begins with the same letter as the additional argument. The following example will help show what I mean: fun.tester <- function(abc, ...){ + print(abc) + } But if I input: fun.tester(0, a=1) It returns the value '1' for abc. It does, however, work properly if I input: fun.tester(abc=0, a=1) I think you'll need to dig into sys.call() and match.call() and put together your own matching scheme to force a function to first match by position and then match all else by name. If match.call() is unfamiliar to you, it is advised to read the first 10 lines of lm(). HTH, Chuck p.s. every argument that comes AFTER '...' in the formals must match exactly. Perhaps this would help you. It seems like a simple problem, so I would assume I'm doing something stupid, but I haven't been able to find a solution anywhere. Thanks! -- View this message in context: http://www.nabble.com/Passing-additional-arguments-through-%27...%27-tp24501159p24501159.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Charles C. Berry (858) 534-2098 Dept of Family/Preventive Medicine E mailto:cbe...@tajo.ucsd.edu UC San Diego http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 92093-0901 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
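The matching order described in this thread (exact names, then partial names, then position, with post-'...' arguments requiring exact names) can be checked directly:

```r
fun.tester <- function(abc, ...) abc

fun.tester(0, a = 1)        # 'a' partially matches 'abc', so abc is 1;
                            # the 0 falls through to '...'
fun.tester(abc = 0, a = 1)  # exact match wins: abc is 0, 'a' goes to '...'
fun.tester(0, b = 1)        # 'b' matches nothing: abc is 0 by position

# Formals declared AFTER '...' must be named in full:
g <- function(..., verbose = FALSE) verbose
g(verb = TRUE)     # 'verb' is swallowed by '...'; verbose stays FALSE
g(verbose = TRUE)  # exact name sets it
```

This is why putting an argument after '...' (as Chuck notes) is the standard way to protect it from accidental partial matches.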
[R] spreadsheet-style autoupdating
Dear all, I am curious whether one can automatically update, say, a column in R, similar to how spreadsheets (Excel, etc.) do. In my specific case I have two columns with logical data, which individually convey different information, although ultimately for a common purpose (flagging for removal). I would thus need a third logical variable that would take TRUE if either of the previous columns takes the value TRUE for the given line. Without using RExcel, would there be a way of automating this (by 'manual' I mean running some code to recreate the third variable prior to printing it)? To R purists: I am not sure that this would amount to good programming habits. Also, this way of thinking comes from my infancy with Excel; should there be a better approach, please let me know. Thank you Liviu __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
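R has no spreadsheet-style reactive cells, but the derived column can be recomputed in one vectorized step whenever the inputs change; a sketch with invented column names:

```r
df <- data.frame(flag1 = c(TRUE, FALSE, FALSE),
                 flag2 = c(FALSE, FALSE, TRUE))

# Recompute the derived column after any change to flag1/flag2:
df$remove <- df$flag1 | df$flag2
df$remove

# Or wrap the rule in a small refresher to call before printing:
refresh <- function(d) { d$remove <- d$flag1 | d$flag2; d }
df <- refresh(df)
```

The idiomatic R habit is to treat the third column as something you derive on demand rather than as state to keep synchronized.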
[R] Help with averaging
Hi I am using the following script to average a set of data of 62 columns into 31 columns. The data consists of values of ln(0.01), i.e. -4.60517, instead of NA's. These need to be averaged for each row (i.e. 2 values being averaged). What would I need to change to meet these conditions: 1. If each run of the sample has a value, the average is given 2. If only one run of the sample has a value, that value is given (i.e. no averaging) 3. If both runs have missing values, NA is given. I have tried changing (na.strings = "NA") to (na.strings = "-4.60517") but this causes all the pairs with even one NA to return an NA value. I would prefer not to use for loops as this would slow the script down considerably. #SCRIPT STARTS rm(list=ls()) setwd("C:/Documents and Settings/Amit Patel") #na.strings makes NA's readable zz <- read.csv("Pr9549_LabelFreeData_ByExperimentAK.csv", strip.white = TRUE, na.strings = "NA") ix <- seq(from=2, to=62, by=2) averagedResults <- (zz[,ix] + zz[,ix+1])/2 averagedResults <- cbind(zz[,1], averagedResults) colnames(averagedResults) <- c("PCI","G1-C1","G1-C2","G1-C3","G1-C4","G1-C5","G1-C6","G1-C7","G1-C8","G2-C9","G2-C10","G2-C11","G2-C12","G2-C13","G2-C14","G2-C15","G2-C16","G3-C17","G3-C18","G3-C19","G3-C20","G3-C21","G3-C22","G3-C23","G4-C24","G4-C25","G4-C26","G4-C27","G4-C28","G4-C29","G4-C30","G4-C31") write.csv(averagedResults, file = "Pr9549_averagedreplicates.csv", row.names = FALSE) #SCRIPT ENDS __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
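The three rules (average when both present, keep the single value, NA when both missing) are what rowMeans with na.rm = TRUE gives on each pair of columns; the `(a + b)/2` form in the script fails because any NA propagates. A sketch on toy data (column names invented):

```r
zz <- data.frame(id   = 1:4,
                 run1 = c(1.0, NA,  3.0, NA),
                 run2 = c(2.0, 5.0, NA,  NA))

ix  <- seq(2, ncol(zz) - 1, by = 2)  # first column of each pair
avg <- sapply(ix, function(j) rowMeans(zz[, c(j, j + 1)], na.rm = TRUE))

# rowMeans returns NaN when BOTH entries are missing; map that to NA:
avg[is.nan(avg)] <- NA
avg  # 1.5, 5, 3, NA
```

No loop over rows is needed; the only loop is over the 31 column pairs.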
[R] searching for elements
Hello! For the past three years, I have been using R extensively in my PhD program in Finance for statistical work. Normally, I can figure this kind of thing out on my own, but I am completely stumped as to why the following code does not work. I have two variables: sigs and p0_recent. dim(sigs) = 296 3 dim(p0_recent) = 504 7 I want to compare each element in the first column of sigs with the elements in the first column of p0_recent. In other words, does sigs[i,1] == p0_recent[j,1], where i = 1:dim(sigs)[1] and j = 1:dim(p0_recent)[1]? I've been trying: for (j in 1:dim(p0_recent)[1]) { + for (i in 1:dim(sigs)[1]) { + if (sigs[i,1] == p0_recent[j,1]) { + print(sigs[i,1]) + }}} But, I get: Error in Ops.factor(sigs[i, 1], p0_recent[j, 1]) : level sets of factors are different Is there a better way than for loops to compare each element in one column to each element in another column of different length? If not, can anyone suggest a solution? CWE __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
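A loop-free, factor-safe sketch of the comparison being asked for (the values are made up):

```r
sigs_col <- factor(c("AAA", "BBB", "CCC"))
p0_col   <- factor(c("CCC", "DDD", "AAA"))

# Convert factors to character so differing level sets don't matter,
# then compare whole columns at once instead of nested loops:
common <- intersect(as.character(sigs_col), as.character(p0_col))
common  # the values present in both columns

# Or a logical index into sigs for subsetting:
hits <- as.character(sigs_col) %in% as.character(p0_col)
```

`%in%`/intersect scale far better than the double loop, and converting to character sidesteps the Ops.factor error entirely.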
[R] Differing Variable Length Inconsistencies in Random Effects/Regression Models
Dear All, I am quite new to R and am having a problem trying to run a linear model with random effects / a regression: my variable lengths are different and the models refuse to compute any further. The code I have been using is as follows: vc <- read.table("P:\\R\\Testvcomp10.txt", header=T) attach(vc) family <- factor(family) colms <- vc[, 4:13] ## this assigns the 10 columns containing marker data to a new variable, as the column names are themselves not in any recognisable sequence vcdf <- data.frame(family, peg.no, ec.length, syll.length, colms) library(lme4) for (c in levels(family)) + { for (i in 1:length(colms)) + { fit <- lmer(peg.no ~ 1 + (1|c/i), vcdf) + } + summ <- summary(fit) + av <- anova(fit) + print(summ) + print(av) + } This gives me: Error in model.frame.default(data = vcdf, formula = peg.no ~ 1 + (1 + : variable lengths differ (found for 'c') I had posted a similar message on the R mixed model list a few days ago, with respect to my fundamental methods, and Jerome Goudet had kindly referred me to an alternative approach using residuals obtained from a random effects model in lmer(), and then doing regressions using those [the residuals being the dependent variable and my marker data columns the independent variables].
The code for that is as follows: vc <- read.table("P:\\R\\Text Files\\Testvcomp10.txt", header=T, sep="", dec=".", na.strings="NA", strip.white=T) attach(vc) family <- factor(family) colms <- vc[, 4:13] names(vc) [1] male.parent family offspring.id P1L55 P1L73 [6] P1L74 P1L77 P1L91 P1L96 P1L98 [11] P1L100 P1L114 P1L118 peg.no ec.length [16] syll.length vcdf <- data.frame(family, colms, peg.no, ec.length, syll.length) library(lme4) famfit <- lmer(peg.no ~ 1 + (1|family), na.action=na.omit, vcdf) resfam <- residuals(famfit) for (i in 1:length(colms)) + { + print("Marker", i) + regfam <- abline(lm(resfam ~ i)) + print(regfam) + } This again gives me the error: [1] Marker Error in model.frame.default(formula = resfam ~ i, drop.unused.levels = TRUE) : variable lengths differ (found for 'i') My variables all have missing values somewhere or other. The missing values are not consistent across individuals, i.e. some individuals have some values missing, others have others, and as much as I have tried to use na.action=na.omit and na.rm=T, the differing-variable-length problem is dogging me persistently. I also tried to isolate the residuals, save them in a new variable (called 'resfam' here), and save that in the data frame vcdf that I had created earlier. The problem with that was that when the residuals were computed, lmer() omitted the missing data in 'peg.no' with respect to 'family', which is obviously not the same data missing for, say, another variable, e.g. 'ec.length' - leading again to an inconsistency in variable lengths. data.frame() would then not accept that addition to the previous set. This was fairly obvious right from the start, but I decided to try it anyway. It didn't work. I apologise if the solution to working with these different variable lengths is obvious and I don't know it - but I don't know R that well at all.
My data files can be downloaded at the following location: http://www.filesanywhere.com/fs/v.aspx?v=896d6b88616173be71ab (excel- .xlsx) http://www.filesanywhere.com/fs/v.aspx?v=896d6b88616174a76e9e (.txt file) Any pointers would be greatly appreciated, as this is holding me up loads. Thanks a ton for your help, Aditi -- A Singh aditi.si...@bristol.ac.uk School of Biological Sciences University of Bristol __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] searching for elements
Hi, On Jul 15, 2009, at 1:00 PM, Chyden Finance wrote: Hello! For the past three years, I have been using R extensively in my PhD program in Finance for statistical work. Normally, I can figure this kind of thing out on my own, but I am completely stumped as to why the following code does not work. I have two variables: sigs and p0_recent. dim(sigs) = 296 3 dim(p0_recent) = 504 7 I want to compare each element in the first column of sigs with the elements in the first column of p0_recent. In other words, does sigs[i,1] == p0_recent[j,1], where i = 1:dim(sigs)[1] and j = 1:dim(p0_recent)[1]. I've been trying: for (j in 1:dim(p0_recent)[1]) { + for (i in 1:dim(sigs)[1]) { + if (sigs[i,1] == p0_recent[j,1]) { + print(sigs[i,1]) + }}} But, I get: Error in Ops.factor(sigs[i, 1], p0_recent[j, 1]) : level sets of factors are different It seems that this particular problem is due to the fact that you are comparing two sets of factors with different levels, which is what the Ops.factor error is saying. But let's make it more clear: R> f1 <- factor(c('a','b','c')) R> f1 [1] a b c Levels: a b c R> f2 <- factor(c('c','d','e')) R> f2 [1] c d e Levels: c d e R> f1[3] == f2[1] Error in Ops.factor(f1[3], f2[1]) : level sets of factors are different Is there a better way than for loops to compare each element in one column to each element in another column of different length? If not, can anyone suggest a solution? Probably better ways, but how about starting by first nuking the inner loop? Something like this should work: for (j in 1:nrow(p0_recent)) { found <- match(p0_recent[j,1], sigs[,1]) ... } In each iteration found will be NA if the p0_recent element does not appear in sigs[,1]; otherwise it will be the index of the first element of sigs[,1] that equals the p0_recent element you're querying with.
(Assuming you fix your factor/level issue) HTH, -steve -- Steve Lianoglou Graduate Student: Physiology, Biophysics and Systems Biology Weill Medical College of Cornell University Contact Info: http://cbio.mskcc.org/~lianos/contact __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Spaces in a name
Not sure how to get formula to handle spaces, but to get going you can just change the names in the data frame after you have read it, e.g. names(mcReg) <- make.names(names(mcReg)) will (usually) give you Small.Bank.Acquired. Allan On 15/07/09 15:40, Idgarad wrote: I am reading regressors from an excel file (I have no control over the file) and some of the element names have spaces, i.e. Small Bank Aquired, but I have found that lm(SourceData ~ . - Small Bank Aquired, mcReg) doesn't work (mcReg = modelCurrentRegressors). As they are toggles I have run them through factor() to be treated properly as 0 or 1, but because I am grabbing automagically the first 2/3rds of the data, some of the regressors are either all 0s or all 1s, so I need to take them out of the model by hand for now until I find a nice automatic method for removing regressors that have only one level. So primarily: how do I handle names that include spaces in this context? And as a bonus: anyone have a nice method for yanking regressors that have only a single factor level from the lm() function? e.g. (for the following 30 elements) 0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0, 1,1,1,1,1,1,1,1,1,1 As you can see, grabbing the first 2/3rds is all 0s and the last 1/3rd is all ones (doing an in-sample forecast diagnostic: building the model only on the first 2/3rds of the data, then forecasting the next 1/3rd and comparing.) Sorry if I am rambling a bit, still on cup of coffee #1... [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
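Both halves of the question can be sketched in a few lines (the data frame and its contents are invented for illustration):

```r
mcReg <- data.frame(`Small Bank Acquired` = c(0, 1, 0, 1),
                    AllZero = c(0, 0, 0, 0),    # constant regressor
                    y = c(1.2, 0.7, 1.5, 0.9),
                    check.names = FALSE)

# 1) Make the names formula-safe:
names(mcReg) <- make.names(names(mcReg))  # "Small.Bank.Acquired"

# 2) Drop regressors with only one distinct value:
keep  <- sapply(mcReg, function(x) length(unique(x)) > 1)
mcReg <- mcReg[, keep, drop = FALSE]
names(mcReg)
```

Backtick-quoting a name inside the formula (e.g. y ~ . - `Small Bank Acquired`) should also work without renaming, so renaming is a convenience rather than a necessity.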
Re: [R] searching for elements
One more thing, Error in Ops.factor(sigs[i, 1], p0_recent[j, 1]) : level sets of factors are different It seems that this particular problem is due to the fact that you are comparing two sets of factors with different levels, which is what the Ops.factor error is saying. But let's make it more clear: R> f1 <- factor(c('a','b','c')) R> f1 [1] a b c Levels: a b c R> f2 <- factor(c('c','d','e')) R> f2 [1] c d e Levels: c d e R> f1[3] == f2[1] Error in Ops.factor(f1[3], f2[1]) : level sets of factors are different One way to fix this would be to ensure all factors have the same levels from the get-go, like: R> f1 <- factor(c('a','b','c'), levels=c('a','b','c','d','e')) R> f2 <- factor(c('c','d','e'), levels=c('a','b','c','d','e')) R> f1 [1] a b c Levels: a b c d e R> f2 [1] c d e Levels: a b c d e R> f1[3] == f2[1] [1] TRUE -- Steve Lianoglou Graduate Student: Physiology, Biophysics and Systems Biology Weill Medical College of Cornell University Contact Info: http://cbio.mskcc.org/~lianos/contact __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] RODBC results from stored procedure
Short of uploading a SQL Server database, I don't think I can make this example reproducible, but I hope it's not so complicated as to require reproducibility. I'm using RODBC to get data from Microsoft SQL Server. I can call a parametrized stored procedure without a problem and the proc does indeed execute successfully. However, even though the proc returns the results, I found that I had to modify the proc so that, in addition to returning the results to the caller, it also saved them to an actual SQL Server table. Then I was able to make a second call to sqlQuery with a simple select statement on the results table and retrieve the results back into R. My question is: can I get stored proc results directly back into R without having to populate and query a results table?

dbdata <- sqlQuery(conn, "sp_GetReturns '2009-07-10','2009-07-14'")
head(dbdata)
character(0)  # no data here

dbdata <- sqlQuery(conn, "select * from returns order by Date asc")
# ...success...
        Date Asset1 Asset2
  2009-07-10  0.01    0.02
  2009-07-13  0.007  -0.003
...

I'd appreciate any suggestions, Andrew W

rm(dbdata)
# run query in query analyzer and get data from rtnsUnderscored table
#dbdata <- sqlFetch(conn, sqtable=sp_GetModelRtns4OptimizeRangeVer4 5000,'2001-10-01','2009-07-14)
#dbdata <- sqlFetch(conn, sqtable=sp_GetModelRtns4OptimizeRangeVer4 5000,'2009-07-01','2009-07-14)
dbdata <- sqlQuery(conn, "select * from rtnsUnderscored order by Date asc")
[R] duplicate data points on a line graph
Hi, I am new to R plot. I am trying to increase the size of a data point when duplicate data points exist:

x  y
1 10
1 10
2  3
4  5
9  8

In the above example (1, 10) would be displayed larger than the other data points. Could someone give me some assistance with this problem?

757-864-7114 LARC/J.L.Shipman/jshipman jeffery.l.ship...@nasa.gov
Re: [R] duplicate data points on a line graph
LARC/J.L.Shipman/jshipman wrote: [duplicate-point plotting question, quoted above]

Not exactly what you want, but ?sunflowerplot might come close.

Dieter
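For the data in the question, a quick sketch: sunflowerplot() draws one "petal" per duplicate, and xyTable() (the helper it uses to count duplicates) returns the multiplicities directly:

```r
x <- c(1, 1, 2, 4, 9)
y <- c(10, 10, 3, 5, 8)

sunflowerplot(x, y)  # the duplicated point (1, 10) gets two petals

# xyTable() tabulates the multiplicity of each distinct (x, y) pair
xy <- xyTable(x, y)
xy$number  # 2 for (1, 10), 1 for each of the other points
```

xy$number is also a convenient cex vector if you prefer scaled plain points over sunflowers.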
Re: [R] duplicate data points on a line graph
On 7/15/2009 2:19 PM, NDC/jshipman wrote: [duplicate-point plotting question, quoted above]

A couple of simple approaches:

x <- c(1,1,2,4,9)
y <- c(10,10,3,5,8)
plot(jitter(x), jitter(y))
plot(x, y, cex=c(2,2,1,1,1))

-- Chuck Cleland, Ph.D. NDRI, Inc. (www.ndri.org) 71 West 23rd Street, 8th floor New York, NY 10010 tel: (212) 845-4495 (Tu, Th) tel: (732) 512-0171 (M, W, F) fax: (917) 438-0894
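The cex vector here is hardcoded for five points; for larger data the per-point counts can be computed, e.g. with ave() (a sketch, not part of the original reply):

```r
x <- c(1, 1, 2, 4, 9)
y <- c(10, 10, 3, 5, 8)

# count how many times each (x, y) pair occurs, aligned with the original points
counts <- ave(seq_along(x), paste(x, y), FUN = length)
counts  # 2 2 1 1 1

plot(x, y, cex = counts)
```

paste(x, y) builds a grouping key per (x, y) pair, so ave() replaces each element with its group's size.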
Re: [R] Spaces in a name
Try this with the built-in BOD data set:

> names(BOD) <- paste(names(BOD), "X")
> BOD
  Time X demand X
1      1      8.3
2      2     10.3
3      3     19.0
4      4     16.0
5      5     15.6
6      7     19.8
> lm(`demand X` ~ `Time X`, BOD)

Call:
lm(formula = `demand X` ~ `Time X`, data = BOD)

Coefficients:
(Intercept)     `Time X`
      8.521        1.721

On Wed, Jul 15, 2009 at 10:40 AM, Idgarad idga...@gmail.com wrote: [same spaces-in-names question, quoted above]
Re: [R] duplicate data points on a line graph
Alternatively, you could make use of transparency (on some devices), or use ggplot2 to map the number of observations to the point size:

d <- read.table(textConnection("
x y
1 10
1 10
2 3
4 5
9 8
"), header=TRUE)

library(ggplot2)

# transparency
qplot(x, y, data=d, alpha=I(0.5))

# mapping number of observations to size
d2 <- unique(ddply(d, .(x, y), transform, count=length(x)))
qplot(x, y, data=d2, size=count)

HTH, baptiste

2009/7/15 Chuck Cleland cclel...@optonline.net wrote: [jitter/cex suggestions, quoted above]
Re: [R] RODBC results from stored procedure
tradenet wrote: Short of uploading a SQL Server database, I don't think I can make this example reproducible ... I can call a parametrized stored procedure without a problem and the proc does indeed execute successfully.

This works for me with the popular Northwind database:

channel <- odbcConnect("northwind")  # assume this DSN is configured correctly
sqlQuery(channel, "EXEC CustOrderHist @CustomerID='ALFKI'")

Try with a non-date query first, then switch to the tricky date format in the parameter.

Dieter
[R] GLM Gamma Family logLik formula?
Hello all, I was wondering if someone can enlighten me as to the difference between the logLik in R vis-a-vis Stata for a GLM model with the gamma family. Stata calculates the log-likelihood of the model as (in R notation) some equivalent of

-1/scale * sum(Y/mu + log(mu) + (scale-1)*log(Y) + log(scale) + scale*lgamma(1/scale))

where scale (or dispersion) = 1, Y is the response variable, and mu is the vector of fitted values from the model. R seems to use a very similar approach, but scale is set equal to the calculated dispersion for the gamma model. However, when I calculate the logLik by hand this way the answer differs slightly (by about 0.5) from logLik(glm.m1). I haven't been able to figure out why from the help. If anyone has any ideas, the insight would be much appreciated. Cheers, Skipper
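One plausible source of the gap (an assumption worth checking against the actual model, not a statement from the thread): R's logLik() for a Gamma GLM plugs in the deviance-based dispersion deviance/n, not the Pearson-based dispersion reported by summary(). With that substitution the by-hand gamma log-likelihood matches logLik():

```r
set.seed(1)
x <- runif(100, 1, 5)
y <- rgamma(100, shape = 2, rate = 2 / x)        # gamma response with mean x
fit <- glm(y ~ x, family = Gamma(link = "log"))

# dispersion as used by logLik(): deviance / n (NOT summary(fit)$dispersion)
disp <- fit$deviance / length(y)
mu   <- fitted(fit)

# gamma density with shape = 1/disp and mean mu (so scale = mu * disp)
ll <- sum(dgamma(y, shape = 1 / disp, scale = mu * disp, log = TRUE))

c(byHand = ll, fromR = as.numeric(logLik(fit)))  # agree to rounding error
```

If the by-hand calculation uses summary()'s Pearson dispersion instead, a small constant offset of the kind described is exactly what one would expect.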
[R] strategy to iterate over repeated measures/longitudinal data
Hi Group,

Create some example data:

set.seed(1)
wide_data <- data.frame(
  id = 1:10,
  predictor1 = sample(c("a","b"), 10, replace=TRUE),
  predictor2 = sample(c("a","b"), 10, replace=TRUE),
  predictor3 = sample(c("a","b"), 10, replace=TRUE),
  measurement1 = rnorm(10),
  measurement2 = rnorm(10))

head(wide_data)
  id predictor1 predictor2 predictor3 measurement1 measurement2
1  1          a          a          b  -0.04493361  -0.05612874
2  2          a          a          a  -0.01619026  -0.15579551
3  3          b          b          b   0.94383621  -1.47075238
4  4          b          a          a   0.82122120  -0.47815006
5  5          a          b          a   0.59390132   0.41794156
6  6          b          a          a   0.91897737   1.35867955

The measurements are repeated measures, and I am looking at one predictor at a time. In the actual problem there are around 400,000 predictors (again, one at a time). Now I want to use the multiple measurements (the responses) to run a regression of measurements on a predictor, so I will convert this data from wide to long format and iterate through each predictor. One (inefficient) way is shown below. For each predictor:

1. create a long data set from the predictor and all measurements (using the make.univ function from the multilevel package)
2. run the model and extract the coefficient of interest
3. go to the next predictor

The end result is a vector of 400,000 coefficients. I'm sure this can be improved upon. I will be running this on a Unix cluster with 16G of memory. In the wide format there are 2000 rows (individuals). With 4 repeated measures, it seems converting everything up front could be problematic. Also, I'm not sure how to iterate through that (maybe putting it in a list). Any suggestions? Thanks for your help. Juliet

Here is the inefficient, working code.
library(multilevel)
library(lme4)

# Same data as above
set.seed(1)
wide_data <- data.frame(
  id = 1:10,
  predictor1 = sample(c("a","b"), 10, replace=TRUE),
  predictor2 = sample(c("a","b"), 10, replace=TRUE),
  predictor3 = sample(c("a","b"), 10, replace=TRUE),
  measurement1 = rnorm(10),
  measurement2 = rnorm(10))

# vector of names to iterate over
predictor_names <- colnames(wide_data)[2:4]

# vector to store coefficients
mycoefs <- rep(-1, length(predictor_names))
names(mycoefs) <- predictor_names

for (predictor in predictor_names) {
  long_data <- make.univ(
    data.frame(wide_data$id, wide_data[, predictor]),
    data.frame(wide_data$measurement1, wide_data$measurement2))
  names(long_data) <- c('id', 'predictor', 'time', 'measurement')
  myfit <- lmer(measurement ~ predictor + (1|id), data=long_data)
  mycoefs[predictor] <- myfit@fixef[2]
}
mycoefs
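As an aside, the wide-to-long step can also be done with base reshape(), avoiding the multilevel dependency; a minimal sketch on a cut-down version of the example data (same layout as above, one predictor):

```r
set.seed(1)
wide <- data.frame(id = 1:10,
                   predictor1 = sample(c("a", "b"), 10, replace = TRUE),
                   measurement1 = rnorm(10),
                   measurement2 = rnorm(10))

# stack measurement1/measurement2 into one 'measurement' column, 2 rows per id
long <- reshape(wide, direction = "long",
                varying = c("measurement1", "measurement2"),
                v.names = "measurement",
                timevar = "time", idvar = "id")

nrow(long)  # 20 rows: 10 ids x 2 time points
```

Reshaping once per predictor this way keeps memory proportional to one long frame rather than 400,000 of them.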
[R] storing lm() results and other objects in a list
to clean up some code I would like to make a list of arbitrary length to store various objects for use in a loop. Sample code:

## BEGIN SAMPLE ##
# You can see the need for a loop already
linearModel1 <- lm(modelSource ~ ., mcReg)
linearModel2 <- step(linearModel1)
linearModel3 <- lm(modelSource ~ . - 1, mcReg)
linearModel4 <- step(linearModel3)
# custom
linearModel5 <- lm(modelSource ~ . - ACF - MonthlyST1 - MonthlyST2 - MonthlyBLA, mcReg)

LinearModel1.res <- residuals(linearModel1)
LinearModel2.res <- residuals(linearModel2)
LinearModel3.res <- residuals(linearModel3)
LinearModel4.res <- residuals(linearModel4)
LinearModel5.res <- residuals(linearModel5)

# hmmm, bolt on linearModel[x] as linearModel[x]$arma.fit?
arma1.fit <- auto.arima(LinearModel1.res)
arma2.fit <- auto.arima(LinearModel2.res)
arma3.fit <- auto.arima(LinearModel3.res)
arma4.fit <- auto.arima(LinearModel4.res)
arma5.fit <- auto.arima(LinearModel5.res, stepwise=T, trace=T)

# Ok, what is left over after regression and ARIMA that cannot
# be explained? Stupid outliers.
# AOs can be added to cReg as a normal dummy variable,
# but these are AOs from the model, not the original data.
# Is it better to handle AOs from the original data?
# linearModel[x]$arma.ao?
arma1.ao <- detectAO(arma1.fit)
arma2.ao <- detectAO(arma2.fit)
arma3.ao <- detectAO(arma3.fit)
arma4.ao <- detectAO(arma4.fit)
arma5.ao <- detectAO(arma5.fit)

# What do I do with an innovative outlier? Transfer function or what?
# auto.arima doesn't handle the IO=c(...) stuff. Umm...
# Transfer functions, etc. are a deficiency in the script at this point.
# linearModel[x]$arma.io?
arma1.io <- detectIO(arma1.fit)
arma2.io <- detectIO(arma2.fit)
arma3.io <- detectIO(arma3.fit)
arma4.io <- detectIO(arma4.fit)
arma5.io <- detectIO(arma5.fit)

# Sample of how to auto-grab regressors from detectAO and detectIO and
# append them to our regression array. You'd have to do this for each model,
# as the residuals are where the outliers come from and different models
# would have different residuals left over.
# IO is best left to arimax functions directly. I assume at this point that AOs can be added to regression tables;
# if that is the case then REM out the IO lines and pass the detectIO results
# into arimax(x, y, z, IO=detectIO(blah)).
#
# Need a better understanding of how to address the AOs and IOs in this script before implementing them
# (repeat for each model: cReg1, cReg2, etc.)
#
#cReg1 <- cReg
#fReg1 <- fReg
#for(i in arma1.io$ind){ print(i); cReg1[, paste(sep="", "IO", i)] <- 1*(seq(cReg1[,2]) == i) }
#for(i in arma1.ao$ind){ print(i); cReg1[, paste(sep="", "AO", i)] <- 1*(seq(cReg1[,2]) == i) }
#for(i in arma1.io$ind){ print(i); fReg1[, paste(sep="", "IO", i)] <- 1*(seq(fReg1[,2])) }
#for(i in arma1.ao$ind){ print(i); fReg1[, paste(sep="", "AO", i)] <- 1*(seq(fReg1[,2])) }

# Get the (p,d,q) and (P,D,Q) into a variable so we can re-feed them if necessary.
# Oh crap, absorbing this into linearModel[x] looks ugly for syntax.
# Note: the $arma slot is (p, q, P, Q, period, d, D), so order is arma[c(1, 6, 2)].
arma1.fit$order <- c(arma1.fit$arma[1], arma1.fit$arma[6], arma1.fit$arma[2])
arma2.fit$order <- c(arma2.fit$arma[1], arma2.fit$arma[6], arma2.fit$arma[2])
arma3.fit$order <- c(arma3.fit$arma[1], arma3.fit$arma[6], arma3.fit$arma[2])
arma4.fit$order <- c(arma4.fit$arma[1], arma4.fit$arma[6], arma4.fit$arma[2])
arma5.fit$order <- c(arma5.fit$arma[1], arma5.fit$arma[6], arma5.fit$arma[2])
arma1.fit$seasonal <- c(arma1.fit$arma[3], arma1.fit$arma[7], arma1.fit$arma[4])
arma2.fit$seasonal <- c(arma2.fit$arma[3], arma2.fit$arma[7], arma2.fit$arma[4])
arma3.fit$seasonal <- c(arma3.fit$arma[3], arma3.fit$arma[7], arma3.fit$arma[4])
arma4.fit$seasonal <- c(arma4.fit$arma[3], arma4.fit$arma[7], arma4.fit$arma[4])
arma5.fit$seasonal <- c(arma5.fit$arma[3], arma5.fit$arma[7], arma5.fit$arma[4])

# These two are used for linearModel2 and linearModel4: get only the
# regressors that survived step removal.
newcReg <- cReg[match(names(linearModel2$coeff[-1]), names(cReg))]
newfReg <- fReg[match(names(linearModel2$coeff[-1]), names(fReg))]
newmcReg <- mcReg[match(names(linearModel2$coeff[-1]), names(mcReg))]
newmfReg <- mfReg[match(names(linearModel2$coeff[-1]), names(mfReg))]

# Scenario 1 - all regressors left in
newFit1.b <- Arima(modelSource, order=arma1.fit$order, seasonal=list(order=arma1.fit$seasonal), xreg=mcReg, include.drift=F)
# Scenario 2 - step removal of regressors
newFit2.b <- Arima(modelSource, order=arma2.fit$order, seasonal=list(order=arma2.fit$seasonal), xreg=newmcReg, include.drift=F)
# Scenario 3 - all regressors left in, with intercept removed
newFit3.b <- Arima(modelSource, order=arma3.fit$order, seasonal=list(order=arma3.fit$seasonal), xreg=mcReg, include.drift=F)
# Scenario 4 - step removal of regressors with intercept removed (I have a feeling this is identical to #2 in results)
newFit4.b <- Arima(modelSource, order=arma4.fit$order, seasonal=list(order=arma4.fit$seasonal), xreg=newmcReg, include.drift=F)
# Scenario 5 - robust1, for giggles and grins for now
newFit5.b <- Arima(modelSource, order=arma5.fit$order, seasonal=list(order=arma5.fit$seasonal), xreg=newmcReg, include.drift=F)
# All the drift terms ones are broke as the drift
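The repeated numbered variables above can be collapsed by keeping everything in lists indexed by model number. A minimal sketch with a toy data frame standing in for mcReg/modelSource (names assumed from the post), using plain lm() so it runs without the forecast/TSA packages:

```r
set.seed(1)
# toy stand-in for the real regressor frame; modelSource is the response column
mcReg <- data.frame(modelSource = rnorm(20), x1 = rnorm(20), x2 = rnorm(20))

# one formula per model variant; the fitted objects, residuals, and any
# downstream results (arma fits, outlier tables, ...) live in parallel lists
formulas <- list(m1 = modelSource ~ .,
                 m3 = modelSource ~ . - 1)
models   <- lapply(formulas, function(f) lm(f, data = mcReg))
resids   <- lapply(models, residuals)

sapply(resids, length)  # 20 residuals per model
```

From here the arma fits would just be another `lapply(resids, auto.arima)` over the same list, so adding a sixth model means adding one formula, not five new variables.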