Re: [R] by function ??

2009-12-09 Thread Meyners, Michael, LAUSANNE, AppliedMathematics
Or, more generally (if you need to include more than just one variable from 
TestData), something like

by(TestData, LEAID, function(x) median(x$RATIO))

Agreed, this is less appealing for the given example than Ista's code, but 
might help to better understand by and to generalize its use to other 
situations.
Michael 
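
A minimal sketch of the two calls discussed in this thread, assuming a small invented TestData (the poster's real data is not shown) and indexing by TestData$LEAID rather than relying on an attached LEAID. The third argument of by() has to be a function; median(RATIO) is a call, not a function, which is why the original attempt fails.

```r
# Toy stand-in for the poster's data; the values are invented for illustration.
TestData <- data.frame(
  LEAID = c("A", "A", "B", "B", "B"),
  RATIO = c(1, 3, 2, 4, 9)
)

# Pass the function itself: by() calls it once per group on the RATIO values.
res_vec <- by(TestData[, "RATIO"], TestData$LEAID, median)

# Pass the whole data frame: each piece x is a data frame, so an anonymous
# function selects the column(s) it needs.
res_df <- by(TestData, TestData$LEAID, function(x) median(x$RATIO))

print(res_vec)
print(res_df)
```

The anonymous-function form is the one that generalizes, since x can be used to combine several columns inside the function.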

 -Original Message-
 From: r-help-boun...@r-project.org 
 [mailto:r-help-boun...@r-project.org] On Behalf Of Ista Zahn
 Sent: Wednesday, 9 December 2009 02:54
 To: L.A.
 Cc: r-help@r-project.org
 Subject: Re: [R] by function ??
 
 Hi,
 I think you want
 
 by(TestData[ , "RATIO"], LEAID, median)
 
 -Ista
 
 On Tue, Dec 8, 2009 at 8:36 PM, L.A. ro...@millect.com wrote:
 
  I'm just learning and this is probably very simple, but I'm stuck.
    I'm trying to understand the by().
  This works.
  by(TestData, LEAID, summary)
 
  But, This doesn't.
 
  by(TestData, LEAID, median(RATIO))
 
 
  ERROR: could not find function "FUN"
 
  HELP!
  Thanks,
  LA
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide 
  http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 
 
 
 
 --
 Ista Zahn
 Graduate student
 University of Rochester
 Department of Clinical and Social Psychology http://yourpsyche.org
 
 



[R] R: Serial Correlation in panel data regression

2009-12-09 Thread Millo Giovanni
Dear Sayan,
 
no, unfortunately I don't think it will. 
 
Here's basically how coeftest() works: if you call the coeftest() function on a 
model object, say: 'mymodel', it will apply both a 'coef' and a 'vcov' method 
to mymodel in order to extract beta and vcov(beta) and do a Wald test. 
coeftest() works with many different kinds of models, represented by 'lm', 
'glm', 'plm' objects and so on, each containing a 'standard' covariance matrix, 
so that the default behaviour is just to extract this latter.
 
Alternatively, you can supply a vcov method of your choice to coeftest() and 
have it do robust testing etc., but it will still have to be one that fits your 
kind of model. So if 'mymodel' is a plm object, then 
 
 coeftest(mymodel, vcov=vcovHC)
 
will use the White-Arellano covariance matrix, which as observed is robust vs. 
serial correlation in its peculiar way, different from the Newey-West-based 
vcovHAC for 'lm' objects.
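
The mechanism described above can be sketched in base R. This is only an illustration of the idea (extract the coefficients, extract or substitute a covariance matrix, form Wald statistics), not the code of coeftest() itself; wald_table() is a hypothetical helper name, and it uses a normal reference distribution throughout, whereas the real function handles degrees of freedom and much else.

```r
# wald_table() is a hypothetical helper for illustration, not part of lmtest.
wald_table <- function(model, vcov. = NULL) {
  beta <- coef(model)
  # Default: the model's own covariance matrix; otherwise use the supplied
  # function (applied to the model) or the supplied matrix.
  if (is.null(vcov.)) {
    V <- vcov(model)
  } else if (is.function(vcov.)) {
    V <- vcov.(model)
  } else {
    V <- vcov.
  }
  se <- sqrt(diag(V))
  z <- beta / se
  cbind(Estimate = beta, "Std. Error" = se, "z value" = z,
        "Pr(>|z|)" = 2 * pnorm(-abs(z)))
}

fit <- lm(dist ~ speed, data = cars)
tab <- wald_table(fit)                 # default covariance matrix
tab_fun <- wald_table(fit, vcov. = vcov)  # same thing, supplied as a function
print(tab)
```

Swapping in a different covariance function changes only the standard errors and test statistics; the coefficient estimates stay the same, which is why the supplied function has to understand the class of the fitted model.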
 
I'm too ignorant of the subject to give advice on tobit models, but a quick 
glance (?tobit) reveals that 'tobit' class objects inherit from 'survreg' ones, 
so that's the direction in which to look.
 
Maybe you are in a position to simply pool the data and use standard tobit and 
vcovHAC? Panel data would have N observations out of NT that are serially 
uncorrelated by construction, and of course this would imply the assumption of 
no individual effects whatsoever (but I am just guessing here...). 
 
Best wishes,
Giovanni



From: sayan dasgupta [mailto:kitt...@gmail.com] 
Sent: Wednesday, 9 December 2009 06:59
To: Millo Giovanni; Achim Zeileis; yves.croiss...@let.ish-lyon.cnrs.fr
Cc: r-help@r-project.org
Subject: Re: Serial Correlation in panel data regression



Dear Sir,
Thanks for your reply.
But there is still a catch. Basically I want to do a panel tobit. I am using the 
tobit function from the AER package on panel data.
Suppose that Gasoline$lgaspcar is zero-inflated data and I do 
m1 <- tobit(as.formula(paste("lgaspcar ~", rhs)), data=Gasoline)

then if I do library(lmtest)

coeftest(m1, vcovHC)
Will it take account of the heteroskedasticity and serial correlation (within 
country) of the data?


Regards 
Sayan Dasgupta






On Tue, Dec 8, 2009 at 8:29 PM, Millo Giovanni giovanni_mi...@generali.com 
wrote:


Dear Sayan,

there is a vcovHC method for panel models doing the White-Arellano 
covariance matrix, which is robust vs. heteroskedasticity *and* serial 
correlation, although in a different way from that of vcovHAC. You can supply 
it to coeftest as well, just as you did. The point is in estimating the model 
as a panel model in the first place.

So this should do what you need: 


data("Gasoline", package="plm")
Gasoline$f.year=as.factor(Gasoline$year)

library(plm) 

rhs <- "-1 + f.year + lincomep+lrpmg+lcarpcap"

pm1 <- plm(as.formula(paste("lgaspcar ~", rhs)), data=Gasoline, 
model="pooling")
library(lmtest)
coeftest(pm1, vcov=vcovHC)

Please refer to the package vignette for 'plm' to check what it does 
exactly. Let me know if there are any issues.

Best,
Giovanni 




-Original Message-
From: Achim Zeileis [mailto:achim.zeil...@wu-wien.ac.at]
Sent: Tue 08/12/2009 13.48
To: sayan dasgupta
Cc: r-help@R-project.org; yves.croiss...@let.ish-lyon.cnrs.fr; Millo 
Giovanni
Subject: Re: Serial Correlation in panel data regression

On Tue, 8 Dec 2009, sayan dasgupta wrote:

 Dear R users,
 I have a question here

 library(AER)
 library(plm)   
 library(sandwich)
 ## take the following data
 data("Gasoline", package="plm")
 Gasoline$f.year=as.factor(Gasoline$year)

 Now I run the following regression

 rhs <- "-1 + f.year + lincomep+lrpmg+lcarpcap"
 m1 <- lm(as.formula(paste("lgaspcar ~", rhs)), data=Gasoline)
 ###Now I want to find the autocorrelation,heteroskedasticity adjusted
 standard errors as a part of coeftest
 ### Basically I would like to take care of the within country serial
 correlation

 ###that is I want to do
 coeftest(m1, vcov=function(x) vcovHAC(x,order.by=...))

 Please suggest what should be the argument of order.by and whether 
that will
 give me the desired result

Currently, the default vcovHAC() method just implements the time series
case. A generalization to panel data is not yet available.

Maybe Yves and Giovanni (authors of plm) have done something in that
direction...

sorry,
Z





 
Re: [R] grep() exclude certain patterns?

2009-12-09 Thread Gustaf Rydevik
Hi,
Just a quick note regarding google and R: I use www.rseek.org almost
exclusively, and it tends to give me the results I need. It is based on
google, but uses a number of smart tricks to ferret out R-relevant
information.

/Gustaf

-- 
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik



[R] {Lattice} cloud() help

2009-12-09 Thread J Chen

Dear all,

I'm a lattice graphics newbie. I'm trying to make a cut through the 3D
scatterplot with cloud(), so adding a partially transparent plate
perpendicular to the z-axis in order to separate the cloud.

code:

library(lattice)
cloud(Sepal.Width~Petal.Length*Petal.Width, data=iris)

Could someone give me some hints on how to manipulate the panel functions in
this case?

Thanks!
Jimmy



Re: [R] Exporting Contingency Tables with xtable

2009-12-09 Thread Gabor Grothendieck
Try the latex function in the Hmisc package.  Using the state.*
variables built into R for sake of example:

library(Hmisc)
latex(table(state.division, state.region), rowlabel = "X", collabel = "Y",
      file = "")



On Wed, Dec 9, 2009 at 12:04 AM, Na'im R. Tyson nty...@clovermail.net wrote:
 Dear R-philes:

 I am having an issue with exporting contingency tables with xtable().  I set
 up a contingency table and convert it to a matrix for passing to xtable() as
 shown below.

 v.cont.table <- table(v_lda$class, grps,
        dnn=c("predicted", "observed"))
 v.cont.mat <- as.matrix(v.cont.table)

 Both produce output as follows:

                observed
 predicted  uh uh~
      uh  201  30
      uh~   6  10

 However, when I construct the latex table with xtable(v.cont.mat), I get a
 good table without the headings of predicted and observed.

 \begin{table}[ht]
 \begin{center}
 \begin{tabular}{rrr}
  \hline
   uh  uh\~{} \\
  \hline
 uh  201   30 \\
  uh\~{}    6   10 \\
   \hline
 \end{tabular}
 \end{center}
 \end{table}

 Question: is there any easy way to retain or re-insert the dimension names
 from the contingency table and matrix?



[R] New version of ecological modelling software available with improved R interface

2009-12-09 Thread Bio7

Dear R users,

in the hope that this application may also be useful to some R users,
I would like to announce a new Windows version of the ecological modeling
software Bio7.

For information:

Bio7 uses a pure Rserve approach to interface Java and R and has a feature
rich GUI
to access and execute R methods from Java embedded in a Rich Client Platform
based on Eclipse.

In this release (1.4) a new R perspective is available with a spreadsheet
component and the R-Shell view to import, export data from Excel, OpenOffice
and CSV and to transfer them to a R workspace.
In addition new methods have been added to the R-Shell interface to transfer
data from and to the spreadsheet component.

http://www.uni-bielefeld.de/biologie/Oekosystembiologie/bio7app/flashtut/rperspective.htm
 

Furthermore Bio7 1.4 embeds several tools for:

- Creation and analysis of spatial explicit simulation models.

- Image Analysis (embedded ImageJ). 
  Transfer of images to and from R is supported!
  For the limits see:
  
http://n4.nabble.com/Transfer-images-limits-and-tests-td622869.html#a622869

- Fast communication between R and Java (with Rserve) and
   the possibility to use R methods inside Java, BeanShell and Groovy.
- Interpretation of Java and script creation (with BeanShell).
- Direct dynamic compilation of Java (Janino).
- Creation of methods for Java, BeanShell, Groovy and R
   (integrated editors for Java, R, BeanShell+Groovy).
- Sensitivity analysis with an embedded flowchart editor in which
  scripts, macros and compiled code can be dragged and executed.
- Creation of 3d OpenGL (Jogl) models. Dynamic data visualization from R
possible.
- Visualizations and simulations on an embedded 3d globe
   (World Wind Java SDK) see:
   
http://www.uni-bielefeld.de/biologie/Oekosystembiologie/bio7app/flashtut/worldwinddynamic.htm
 
   

Overview of changes in 1.4:

http://n4.nabble.com/Bio7-1-4-released-td931650.html#a931650


Bio7 1.4 is available for Windows (with R and JRE embedded) and Linux (JRE
embedded) and can be downloaded from:

http://www.uni-bielefeld.de/biologie/Oekosystembiologie/bio7app/index.html

With kind regards

M. Austenfeld



Re: [R] Convert a list of N dataframes to N dataframes

2009-12-09 Thread Karl Ove Hufthammer
On Mon, 7 Dec 2009 13:23:06 -0600 Mark Na mtb...@gmail.com wrote:
 This worked very nicely (thanks for plyr, Hadley) but now I would like to
 unlist my list into the individual dataframes, preferably with their
 original names (data1, etc).
 
 I've tried to do this with:
 
  ldply(datalist,unlist)

Are you perhaps looking for
?attach
?
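
A minimal sketch of what ?attach offers here, assuming a small named list called datalist (the names data1 and data2 follow the question; the contents are invented). attach() puts the list's elements on the search path under their names; list2env(), available in current versions of R, is shown as an alternative that copies them into an environment instead.

```r
# Invented stand-in for the list of data frames from the question.
datalist <- list(
  data1 = data.frame(a = 1:3),
  data2 = data.frame(b = c("x", "y"))
)

# attach() makes the elements visible by name on the search path.
attach(datalist)
n1 <- nrow(data1)
n2 <- nrow(data2)
detach(datalist)

# Alternative: copy the elements into an environment as separate objects.
env <- new.env()
list2env(datalist, envir = env)
print(ls(env))
```

Using list2env(datalist, envir = globalenv()) instead would create data1 and data2 directly in the workspace.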

-- 
Karl Ove Hufthammer



Re: [R] problem with split eating giga-bytes of memory

2009-12-09 Thread jim holtman
Here is an example:


> # create test data
> N <- 1000000
> x <- data.frame(a=sample(LETTERS, N, TRUE), b=sample(letters, N, TRUE),
+ c=as.numeric(1:N), d=runif(N))
> system.time({
+ x.df <- split(x, x$a)  # split
+ print(sapply(x.df, function(a) sum(a$c)))
+ })
          A           B           C           D           E           F           G           H
19132375146 19261600080 19290064552 19355472666 19143448231 18973627622 19278423676 19362576931
          I           J           K           L           M           N           O           P
19405443596 19295695044 19052377988 19236047192 19143226220 19197703946 19297192525 19129252399
          Q           R           S           T           U           V           W           X
19272964991 19315856972 19355660155 19303178409 19242322477 19081573240 19309444512 19077003863
          Y           Z
19259313705 19228653862
   user  system elapsed
   1.27    0.02    1.28
> # now use indices
> system.time({
+ x.indx <- split(seq(nrow(x)), x$a)  # create list of indices
+ print(sapply(x.indx, function(a) sum(x$c[a])))
+ })
          A           B           C           D           E           F           G           H
19132375146 19261600080 19290064552 19355472666 19143448231 18973627622 19278423676 19362576931
          I           J           K           L           M           N           O           P
19405443596 19295695044 19052377988 19236047192 19143226220 19197703946 19297192525 19129252399
          Q           R           S           T           U           V           W           X
19272964991 19315856972 19355660155 19303178409 19242322477 19081573240 19309444512 19077003863
          Y           Z
19259313705 19228653862
   user  system elapsed
   0.23    0.00    0.23







On Tue, Dec 8, 2009 at 10:26 PM, Mark Kimpel mwkim...@gmail.com wrote:

 Jim, could you provide a code snippet to illustrate what you mean?

 Hadley, good point, I did not know that.

 Mark

 Mark W. Kimpel MD  ** Neuroinformatics ** Dept. of Psychiatry
 Indiana University School of Medicine

 15032 Hunter Court, Westfield, IN  46074

 (317) 490-5129 Work,  Mobile  VoiceMail
 (317) 399-1219 Skype No Voicemail please


   On Tue, Dec 8, 2009 at 11:00 PM, jim holtman jholt...@gmail.com wrote:

 Also instead of 'splitting' the data frame, I split the indices and then
 use those to access the information in the original dataframe.


 On Tue, Dec 8, 2009 at 9:54 PM, Mark Kimpel mwkim...@gmail.com wrote:

 Hadley, just as you were apparently writing I had the same thought and did
 exactly what you suggested, converting all columns except the one that I
 want split to character. It executed almost instantaneously without problem.
 Thanks! Mark

 Mark W. Kimpel MD  ** Neuroinformatics ** Dept. of Psychiatry
 Indiana University School of Medicine

 15032 Hunter Court, Westfield, IN  46074

 (317) 490-5129 Work,  Mobile  VoiceMail
 (317) 399-1219 Skype No Voicemail please


  On Tue, Dec 8, 2009 at 10:48 PM, hadley wickham h.wick...@gmail.com
 wrote:

  Hi Mark,
 
  Why are you using factors?  I think for this case you might find
  characters are faster and more space efficient.
 
  Alternatively, you can have a look at the plyr package which uses some
  tricks to keep memory usage down.
 
  Hadley
 
  On Tue, Dec 8, 2009 at 9:46 PM, Mark Kimpel mwkim...@gmail.com
 wrote:
   Charles, I suspect you are correct regarding copying of the attributes.
   First off, selectSubAct.df is my real data, which turns out to be of the
   same dim() as myDataFrame below, but each column is made up of strings, not
   simple letters, and there are many levels in each column, which I did not
   properly duplicate in my first example. I have amended that below, and with
   the split the new object size is now not 10X the size of the original, but
   100X. My real data is even more complex than this, so I suspect that is
   where the problem lies. I need to search for a better solution to my problem
   than split, for which I will start a separate thread if I can't figure
   something out.
  
   Thanks for pointing me in the right direction,
  
   Mark
  
   myDataFrame <- data.frame(matrix(paste("The rain in Spain",
   as.character(1:1400), sep = "."), ncol = 7, nrow = 399000))
   mySplitVar <- factor(paste("Rainy days and Mondays", as.character(1:1400),
   sep = "."))
   myDataFrame <- cbind(myDataFrame, mySplitVar)
   object.size(myDataFrame)
   ## 12860880 bytes # ~ 13MB
   myDataFrame.split <- split(myDataFrame, myDataFrame$mySplitVar)
   object.size(myDataFrame.split)
   ## 1,274,929,792 bytes ~ 1.2GB
   object.size(selectSubAct.df)
   ## 52,348,272 bytes # ~ 52MB
   Mark W. Kimpel MD  ** Neuroinformatics ** Dept. of Psychiatry
   Indiana University School of Medicine
  
   15032 Hunter Court, Westfield, IN  46074
  
   (317) 490-5129 Work,  Mobile  VoiceMail
   (317) 399-1219 Skype No Voicemail please
  
  
   On Tue, Dec 8, 2009 at 10:22 PM, Charles C. Berry 
 cbe...@tajo.ucsd.edu
  wrote:
  
   On Tue, 8 Dec 2009, Mark Kimpel wrote:
  
 

Re: [R] conditionally merging adjacent rows in a data frame

2009-12-09 Thread Titus von der Malsburg
On Wed, Dec 9, 2009 at 12:11 AM, Gabor Grothendieck
ggrothendi...@gmail.com wrote:
 Here are a couple of solutions.  The first uses by and the second sqldf:

Brilliant!  Now I have a whole collection of solutions.  I did a simple
performance comparison with a data frame that has 7929 lines.

The results were as follows (loading appropriate packages is not included in
the measurements):

 times <- c(0.248, 0.551, 41.080, 0.16, 0.190)
 names(times) <- c("aggregate","summaryBy","by+transform","sqldf","tapply")
 barplot(times, log="y", ylab="log(s)")

So sqldf clearly wins, followed by tapply and aggregate.  summaryBy is slower
than necessary because it computes both mean /and/ sum for both x and dur.
by+transform presumably suffers from the construction of many intermediate data
frames.

Are there any canonical places where R-recipes are collected?  If yes I would
write-up a summary.

These were the competitors:

 # Gary's and Nikhil's aggregate solution:

 aggregate.fixations1 <- function(d) {

   idx  <- c(TRUE,diff(d$roi)!=0)
   d2 <- d[idx,]

   idx  <- cumsum(idx)
   d2$dur <- aggregate(d$dur, list(idx), sum)[2]
   d2$x   <- aggregate(d$x, list(idx), mean)[2]

   d2
 }

 # Marek's summaryBy:

 library(doBy)

 aggregate.fixations2 <- function(d) {

   idx  <- c(TRUE,diff(d$roi)!=0)
   d2 <- d[idx,]

   d$idx  <- cumsum(idx)
   d2$r <- summaryBy(dur+x~idx, data=d, FUN=c(sum,
 mean))[c("dur.sum", "x.mean")]
   d2
 }

 # Gabor's by+transform solution:

 aggregate.fixations3 <- function(d) {

   idx  <- cumsum(c(TRUE,diff(d$roi)!=0))

   d2 <- do.call(rbind, by(d, idx, function(x)
 transform(x, dur = sum(dur), x = mean(x))[1,,drop = FALSE ]))

   d2
 }

 # Gabor's sqldf solution:

 library(sqldf)

 aggregate.fixations4 <- function(d) {

   idx  <- c(TRUE,diff(d$roi)!=0)
   d2 <- d[idx,]

   d$idx  <- cumsum(idx)
   d2$r <- sqldf("select sum(dur), avg(x) x from d group by idx")

   d2
 }

 # Titus' solution using plain old tapply:

 aggregate.fixations5 <- function(d) {

   idx  <- c(TRUE,diff(d$roi)!=0)
   d2 <- d[idx,]

   idx  <- cumsum(idx)
   d2$dur <- tapply(d$dur, idx, sum)
   d2$x <- tapply(d$x, idx, mean)

   d2
 }
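
All five functions share one step: marking where a new run of identical roi values starts and turning those marks into a run index with cumsum(). A small worked sketch of that step, using an invented six-row data frame (only the column names roi, dur and x come from the code above):

```r
# Invented example data: three runs of adjacent rows with the same roi.
d <- data.frame(
  roi = c(1, 1, 2, 2, 2, 1),
  dur = c(10, 20, 5, 5, 5, 30),
  x   = c(1, 3, 4, 5, 6, 7)
)

# TRUE wherever roi differs from the previous row (the first row always starts a run).
starts <- c(TRUE, diff(d$roi) != 0)

# Running count of starts gives one id per run: 1 1 2 2 2 3.
idx <- cumsum(starts)

# Keep the first row of each run, then overwrite dur and x with per-run summaries.
d2 <- d[starts, ]
d2$dur <- as.vector(tapply(d$dur, idx, sum))   # total duration per run
d2$x   <- as.vector(tapply(d$x, idx, mean))    # mean position per run

print(d2)
```

Note that the last row has roi 1 again but forms its own run, because only adjacent rows are merged.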



Re: [R] conditionally merging adjacent rows in a data frame

2009-12-09 Thread Gabor Grothendieck
On Wed, Dec 9, 2009 at 7:59 AM, Titus von der Malsburg
malsb...@gmail.com wrote:
 Are there any canonical places where R-recipes are collected?  If yes I would
 write-up a summary.

If you google for
   R wiki
it's the first hit.



Re: [R] conditionally merging adjacent rows in a data frame

2009-12-09 Thread Nikhil Kaza


This is great! sqldf is exactly the kind of thing I was looking for, for 
other stuff.

I suppose you can speed up both functions 1 and 5 by calling aggregate and 
tapply only once, as was suggested earlier. But it comes at the expense of 
readability.


Nikhil



[R] Bootstrapping in R

2009-12-09 Thread Trafim Vanishek
Dear all,

I get an error when trying to bootstrap from a matrix. The error message is
Error in sample(n, n * R, replace = TRUE) : element 2 is empty;
   the part of the args list of '*' being evaluated was: (n, R)

vv <- c(0.5,3.2,5.4,1.1,1.4,1.2,2.3,2.0)
Reg <- matrix(data=vv, nrow = 4, ncol = 2)

bootcoeff <- function(x){
coefficients(lm(x[,1]~x[,2]))[2]+1
}

boot(Reg, bootcoeff)

It is just an example; in reality I have a matrix whose rows hold x and y
values, and I need to fit a regression to find the slope coefficient,
bootstrapping from the rows.

Thanks a lot for the help.



Re: [R] Bootstrapping in R

2009-12-09 Thread Trafim Vanishek
I left out the number of bootstrap replicates, R:

boot(Reg, bootcoeff, R=10)

but it still doesn't work:
Error in statistic(data, original, ...) : unused argument(s) (original)
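
The second error is consistent with how boot() calls the statistic for an ordinary bootstrap: it passes the data plus a second argument holding the resampled row indices, so a one-argument function is rejected. A minimal sketch under that reading, reusing the example matrix; the argument name idx and the seed are choices made here for illustration.

```r
library(boot)

vv <- c(0.5, 3.2, 5.4, 1.1, 1.4, 1.2, 2.3, 2.0)
Reg <- matrix(data = vv, nrow = 4, ncol = 2)

# The statistic takes the data and a vector of row indices chosen by boot().
bootcoeff <- function(x, idx) {
  d <- x[idx, , drop = FALSE]                 # resampled rows
  coefficients(lm(d[, 1] ~ d[, 2]))[2] + 1    # slope + 1, as in the original
}

set.seed(1)   # only to make the illustration repeatable
b <- boot(Reg, bootcoeff, R = 10)
print(b)
```

With only four rows, some resamples can have no variation in the second column, in which case lm() reports the slope as NA for that replicate.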




Re: [R] Why cannot get the expected values in my function

2009-12-09 Thread Gavin Simpson
On Tue, 2009-12-08 at 23:22 -0500, David Winsemius wrote:
 On Dec 8, 2009, at 11:07 PM, rusers.sh wrote:
 
  Hi,
   In the following function, i hope to save my simulated data into the
  result dataset, but why the final result dataset seems not to be
  generated.
 
 
  #Function
  simdata <- function (nsim) {
 
 # Instead why not:
 cbind(x=runif(nsim), y=runif(nsim) )

or:

m <- matrix(runif(nsim*2), ncol = 2)
## if names on m needed
colnames(m) <- c("x","y")

G

   }
 
  #simulation
  simdata(10)  #correct result
   x   y
  [1,] 0.2655087 0.372123900
  [2,] 0.1848823 0.702374036
  [3,] 0.1680415 0.807516399
  [4,] 0.5858003 0.008945796
  [5,] 0.2002145 0.685218596
  [6,] 0.6062683 0.937641973
  [7,] 0.9889093 0.397745453
  [8,] 0.4662952 0.207823317
  [9,] 0.2216014 0.024233910
  [10,] 0.5074782 0.306768506
   But the dataset result was not assigned the above values. What is the
  problem?
  result  #wrong result??
x  y
  [1,] NA NA
  [2,] NA NA
  [3,] NA NA
  [4,] NA NA
  [5,] NA NA
  [6,] NA NA
  [7,] NA NA
  [8,] NA NA
  [9,] NA NA
  [10,] NA NA
 
  Thanks a lot.
  -- 
  -
  Jane Chang
  Queen's
 
 
 David Winsemius, MD
 Heritage Laboratories
 West Hartford, CT
 
-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,  [f] +44 (0)20 7679 0565
 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London  [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%



[R] code

2009-12-09 Thread Rahim Alhamzawi
Dear support, 
I want to compute the highest probability density for any data. Do you 
have any code that could help me with this?
I look forward to hearing from you as soon as possible.
 
rahim


  


Re: [R] R echo code chunk runs off the page using Lyx and Sweav

2009-12-09 Thread Mark Connolly
I somehow missed the response posted by Ben Bolker.  He is quite correct 
(happily for me!):


\SweaveOpts{keep.source=TRUE}

in your LaTeX code will (I think) keep whatever manual formatting you do 
in all code chunks (or use keep.source=TRUE for particular code chunks 
of concern)




This information has made its way into the latest Sweave manual at 
http://www.statistik.lmu.de/~leisch/Sweave/.




Re: [R] code

2009-12-09 Thread JLucke
?density
?max
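
A minimal sketch of how the two help pages combine, assuming the question is about the location and height of the peak of a kernel density estimate (if a highest-density interval was meant, this does not cover it). The simulated data, the seed and the variable names are invented for illustration.

```r
set.seed(42)                  # only to make the illustration repeatable
x <- rnorm(1000, mean = 3)    # invented data

d <- density(x)               # kernel density estimate on a grid: d$x, d$y
i <- which.max(d$y)           # grid position of the largest estimated density

mode_est <- d$x[i]            # where the estimated density peaks
peak     <- d$y[i]            # the highest estimated density value, max(d$y)

print(c(mode = mode_est, density = peak))
```

The result depends on the bandwidth that density() picks, so it is an estimate rather than an exact property of the data.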







Re: [R] problem in labeling the nodes of tree drawn by rpart

2009-12-09 Thread Terry Therneau
 In the nodes of the tree, the values of the covariates are represented
 with a, b or c (tree attached).

Try help('text.rpart'), and note the 'pretty' argument therein.
There is often not enough room for long labels, and so the default is to
do the severe truncation you speak of.
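
A short sketch of the difference, assuming the rpart package is installed and using its bundled cu.summary data rather than the poster's tree:

```r
library(rpart)

# cu.summary ships with rpart; Type and Country are factors with long levels.
fit <- rpart(Price ~ Mileage + Type + Country, data = cu.summary)

plot(fit, margin = 0.1)
text(fit)               # default: factor levels abbreviated to a, b, c, ...

plot(fit, margin = 0.1)
text(fit, pretty = 0)   # full level names, at the cost of crowded labels
```

With pretty = 0 the labels can overlap on a small plot, which is the room problem mentioned above.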

Terry Therneau



Re: [R] Printing 'k' levels of factors 'n' times each, but 'n' is unequal for all levels ?

2009-12-09 Thread Uwe Ligges



A Singh wrote:

Dear List,

I need to print out each of 'k' levels of a factor 'n' times each, where 
'n' is the number of elements belonging to each factor.


I know that this can normally be done using the gl() command,
but in my case, each level 'k' has an unequal number of elements.

Example with code is as below:

vc <- read.table("P:\\Transit\\CORRECT files\\Everything-newest.csv",
                 header=T, sep=",", dec=".", na.strings="NA", strip.white=T)


vcdf <- data.frame(vc)

tempdf <- data.frame(cbind(vcdf[,1:3], vcdf[,429]))
newtemp <- na.exclude(tempdf)

newtemp[,2] <- factor(newtemp[,2])

groupmean <- tapply(newtemp[,4], newtemp[,2], mean)

newmark <- factor(groupmean, exclude=(groupmean==0 | groupmean==1))
newmark

This is what the output is (going up to 61 levels)
                1                 2                 3                 4
               NA 0.142857142857143             0.444                NA
                5                 6                 8                 9
             0.33     0.09090909090      0.3846153846                NA

. 61
NA

The variable 'groupmean' calculates means for newtemp[,4] for 61 levels 
(k). Levels are specified in newtemp[,2].


I now want to be able to print out each value of 'groupmean'  as many 
times as there are elements in the group for which each is calculated.


So for E.g. if level 1 of newtemp[,2] has about 15 elements, NA should 
be printed 15 times, level 2 = 12 times 0.1428, and so on.


Is there a way of specifying that a list needs to be populated with 
replicates of groupmeans based on values got from newtemp[,2]?


See ?mapply and ?rep, hence

mapply(rep, values, replicates)

where values and replicates are corresponding vectors.
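
A small sketch of that idea on toy data. The vectors grp and val are made-up stand-ins for newtemp[,2] and newtemp[,4]; everything else is base R.

```r
# Toy stand-ins for newtemp[, 2] (grouping factor) and newtemp[, 4] (values).
grp <- factor(c("a", "a", "a", "b", "b", "c"))
val <- c(1, 2, 3, 10, 20, 5)

groupmean <- tapply(val, grp, mean)    # one mean per level
n <- as.vector(table(grp))             # group sizes, in the same level order

# One vector of repeated means per level, kept as a list.
as_list <- mapply(rep, groupmean, n, SIMPLIFY = FALSE)

# rep() is vectorised over 'times', which gives a single flat vector.
flat <- rep(groupmean, times = n)

# If each observation should carry its own group's mean, in data order:
per_row <- ave(val, grp)
```

The last form, ave(), is convenient when the data are not sorted by group, because it returns the means aligned with the original rows.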

Uwe Ligges




I just can't seem to figure this out by myself.

Many thanks for your help.

Aditi

--
A Singh
aditi.si...@bristol.ac.uk
School of Biological Sciences
University of Bristol



Re: [R] Bootstrapping in R

2009-12-09 Thread David Winsemius


On Dec 9, 2009, at 9:05 AM, Trafim Vanishek wrote:


I missed number of bootstrap replicates R

boot(Reg, bootcoeff, R=10)
but still it doesn't work
Error in statistic(data, original, ...) : unused argument(s)  
(original)



On Wed, Dec 9, 2009 at 2:46 PM, Trafim Vanishek  
rdapam...@gmail.com wrote:



Dear all,

I have some error trying to bootstrap from a matrix. The error  
message is

Error in sample(n, n * R, replace = TRUE) : element 2 is empty;
  the part of the args list of '*' being evaluated was: (n, R)

vv <- c(0.5,3.2,5.4,1.1,1.4,1.2,2.3,2.0)
Reg <- matrix(data=vv, nrow = 4, ncol = 2)

bootcoeff <- function(x){
coefficients(lm(x[,1]~x[,2]))[2]+1
}

boot(Reg, bootcoeff)


?boot

And in particular you need to read the Arguments material more  
closely. The boot function is more complicated than you expected. Then  
work through the examples. You may also get help by doing some  
searching at www.rseek.org or with the RSiteSearch function, e.g.:


RSiteSearch("lm coef boot")
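
For orientation, a minimal sketch of the calling convention that ?boot describes: the statistic receives the data plus a vector of resampled row indices. The simulated 30-row matrix is an assumption made here so that the resamples are well behaved; it is not the poster's data.

```r
library(boot)

# Simulated (y, x) pairs in the rows of a matrix; true slope is 2.
set.seed(1)
x <- runif(30)
Reg <- cbind(y = 2 * x + rnorm(30, sd = 0.2), x = x)

# boot() calls statistic(data, indices); 'indices' picks the resampled rows.
bootcoeff <- function(data, indices) {
  d <- data[indices, , drop = FALSE]
  coef(lm(d[, 1] ~ d[, 2]))[2]    # slope of column 1 regressed on column 2
}

b <- boot(Reg, bootcoeff, R = 200)
b$t0          # slope on the original rows
sd(b$t)       # bootstrap standard error of the slope
```

The "unused argument(s)" error in the post arises because the one-argument function has no place to receive the indices.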



It is just an example, in reality I have a matrix in rows of which  
I have x

and y for which I need to make a regression to find the slope coeff
bootstrapping from rows.

Thanks a lot for the help.





David Winsemius, MD
Heritage Laboratories
West Hartford, CT



[R] Greek symbols on ylab= using barchart() {Lattice}

2009-12-09 Thread Peng Cai
Hi All,

I'm trying to write ug/m3 as the y-label, with the Greek letter mu replacing u
and the 3 set as a power.

These commands work in general:

plot.new()
text(0.5, 0.5, expression(symbol(m)))

But I'm not sure how to do it using barchart() from lattice. Can anyone
help please?

Thanks,
Peng Cai



Re: [R] Greek symbols on ylab= using barchart() {Lattice}

2009-12-09 Thread baptiste auguie
Hi,

try this,

barchart(1:2, ylab=expression(mu*g/m^3))

?plotmath
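
A slightly fuller sketch of the same call, assuming lattice is installed. In plotmath, * juxtaposes two pieces with no gap and ~ inserts a space, so both spellings of the label are shown.

```r
library(lattice)

# '*' joins mu and g with no gap; '^' raises 3 to a superscript.
p1 <- barchart(1:2, ylab = expression(mu * g / m^3))

# '~' inserts a space, here between the word and the unit.
p2 <- barchart(1:2, ylab = expression(Concentration ~ (mu * g / m^3)))

print(p1)
print(p2)
```

Lattice functions return an object, so the plots only appear when printed, which happens automatically at the console but not inside functions or loops.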

baptiste



[R] Warning for data.table (with ref)?

2009-12-09 Thread Peng Yu
I get the following message when I load data.table: "dim(refdata) and
dimnames(refdata) no longer allow parameter ref=TRUE, use
dim(derefdata(refdata)), dimnames(derefdata(refdata)) instead". Does it
come from the package ref? Could it be fixed? Or is there something wrong
with my installation?

> library(data.table)
Loading required package: ref
dim(refdata) and dimnames(refdata) no longer allow parameter ref=TRUE,
use dim(derefdata(refdata)), dimnames(derefdata(refdata)) instead
> sessionInfo()
R version 2.10.0 (2009-10-26)
x86_64-unknown-linux-gnu

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] data.table_1.2 ref_0.97



Re: [R] Significant performance difference between split of a data.frame and split of vectors

2009-12-09 Thread Peng Yu
On Tue, Dec 8, 2009 at 11:06 PM, David Winsemius dwinsem...@comcast.net wrote:

 On Dec 9, 2009, at 12:00 AM, Peng Yu wrote:

 On Tue, Dec 8, 2009 at 10:37 PM, David Winsemius dwinsem...@comcast.net
 wrote:

 On Dec 8, 2009, at 11:28 PM, Peng Yu wrote:

 I have the following code, which tests the split on a data.frame and
 the split on each column (as vector) separately. The runtimes differ by
 a factor of 10. When m and k increase, the difference becomes even
 bigger.

 I'm wondering why the performance on data.frame is so bad. Is it a bug
 in R? Can it be improved?

 You might want to look at the data.table package. The author claims
 significant speed improvements over data.frames.

 This bug was found a long time ago and a package has been
 developed for it. Should the fix be integrated in data.frame rather
 than be implemented in an additional package?

 What bug?

Is the slow speed in splitting a data.frame a performance bug?


 David.

 system.time(split(as.data.frame(x),f))

  user  system elapsed
  1.700   0.010   1.786

 system.time(lapply(

 +         1:dim(x)[[2]]
 +         , function(i) {
 +           split(x[,i],f)
 +         }
 +         )
 +     )
  user  system elapsed
  0.170   0.000   0.167

 ###
 m=3
 n=6
 k=3000

 set.seed(0)
 x=replicate(n,rnorm(m))
 f=sample(1:k, size=m, replace=T)

 system.time(split(as.data.frame(x),f))

 system.time(lapply(
      1:dim(x)[[2]]
      , function(i) {
        split(x[,i],f)
      }
      )
  )



[R] Assign variables in a loop to a list

2009-12-09 Thread Márcio Resende

Dear R helpers,
I am new to R and I am having trouble with a function.
I am programming a genetic analysis, and there is a script that generates a
lot of different matrices, for example x, y and z.
What I am trying to do is loop the same script over the different
variables the user supplies to the function, so that the matrices x, y
and z are created for each variable.

myfun <- function(...){  # the variables in (...), for example (DAP, Vol)
Vec = matrix(c(...))
for (i in seq(along = Vec)){
... ## generates x, y and z
# The scripts are not here because they are big

assign(paste("x",i,sep=""),x)
assign(paste("y",i,sep=""),y)  # this generates x1, y1, z1, x2, y2 and z2 for
                               # the example with two variables
assign(paste("z",i,sep=""),z)
}

Here is the step I can't solve:
 I want to collect those variables in a list

structure(list(...), class = "genotype") ## In the example it would be

## structure(list(varX1 = x1, varX2 = x2, varY1 = y1, varY2 = y2, varZ1 = z1,
## varZ2 = z2), class = "genotype")

} #end of function

However, I don't know how to put those variables in this list because I
don't know how many variables the user will declare.

I am not sure if I was clear. I know it is hard without the whole script,
but I think it wouldn't make any difference. It could be thought of as 3
randomly generated matrices each time (each loop).

Thank you very much for your help and for the time spent
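
One possible shape for this, sketched under assumptions: the function name make_genotype and the random 2 x 2 matrices are placeholders for the real analysis, and the variables are assumed to be passed as character strings. The idea is to grow a named list inside the loop instead of calling assign(), so the number of variables never has to be known in advance.

```r
# make_genotype and the random matrices are hypothetical placeholders.
make_genotype <- function(...) {
  vars <- c(...)                      # e.g. "DAP", "Vol"
  out <- list()
  for (i in seq_along(vars)) {
    x <- matrix(rnorm(4), 2)          # stand-ins for the real x, y and z
    y <- matrix(rnorm(4), 2)
    z <- matrix(rnorm(4), 2)
    out[[paste0("varX", i)]] <- x     # store under a name built from i
    out[[paste0("varY", i)]] <- y
    out[[paste0("varZ", i)]] <- z
  }
  structure(out, class = "genotype")
}

g <- make_genotype("DAP", "Vol")
names(g)
```

Individual results are then available as g$varX1, g[["varZ2"]] and so on, for however many variables were supplied.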





Re: [R] Can elements of a list be passed as multiple arguments?

2009-12-09 Thread Peng Yu
On Tue, Dec 8, 2009 at 11:05 PM, David Winsemius dwinsem...@comcast.net wrote:

 On Dec 8, 2009, at 11:37 PM, Peng Yu wrote:

 I want to split a matrix, where both 'u' and 'w' are results of
 possible ways. However, whenever 'n' changes, the argument passed to
 mapply() has to change. Is there a way to pass elements of a list as
 multiple arguments?

 You need to explain what you want in more detail. In your example mapply did
 exactly what you told it to. No errors. Three matrices. What were you
 expecting when you gave it three lists in each argument?

I want a general solution so that I don't have to always write
v[[1]], v[[2]], ..., v[[n]] like in the following, because the
following way would not work if 'n' is an arbitrary number.

w=mapply(function(x,y) {cbind(x,y)}, v[[1]], v[[2]], ..., v[[n]])

One way that I can think of is to somehow expand a list (i.e., v in
this case) to a set of arguments that can be passed to 'mapply()'.
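
A sketch of one way to do that expansion with do.call(), which builds the call from a list so the number of elements does not have to be written out. The sizes below are arbitrary toy values.

```r
m <- 10; n <- 4; k <- 3
set.seed(0)
x <- replicate(n, rnorm(m))
f <- sample(1:k, size = m, replace = TRUE)

# One list of per-group vectors for each column of x.
v <- lapply(seq_len(ncol(x)), function(i) split(x[, i], f))

# do.call() hands the elements of v to mapply() as separate arguments,
# so the same line works for any number of columns.
w <- do.call(mapply, c(list(FUN = cbind, SIMPLIFY = FALSE), v))

sapply(w, dim)   # each group: (rows in group) x n
```

SIMPLIFY = FALSE keeps the result as a list of matrices even when all groups happen to have the same size.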

 m=10
 n=2
 k=3

 set.seed(0)
 x=replicate(n,rnorm(m))
 f=sample(1:k, size=m, replace=T)

 u=split(as.data.frame(x),f)

 v=lapply(
   1:dim(x)[[2]]
   , function(i) {
     split(x[,i],f)
   }
   )

 w=mapply(
   function(x,y) {
     cbind(x,y)
   }
   , v[[1]], v[[2]]
   )



Re: [R] Greek symbols on ylab= using barchart() {Lattice}

2009-12-09 Thread Peng Cai
Hi Baptiste and Others,

Thanks for your help. I'm writing:

ylab=expression(Concentration(mu*g/m^3))

And it's working fine, but is it possible to add a space between
Concentration and (mu*g/m^3)?

Thanks again,
Peng Cai



Re: [R] Significant performance difference between split of a data.frame and split of vectors

2009-12-09 Thread Charles C. Berry

On Wed, 9 Dec 2009, Peng Yu wrote:


On Tue, Dec 8, 2009 at 11:06 PM, David Winsemius dwinsem...@comcast.net wrote:


On Dec 9, 2009, at 12:00 AM, Peng Yu wrote:


On Tue, Dec 8, 2009 at 10:37 PM, David Winsemius dwinsem...@comcast.net
wrote:


On Dec 8, 2009, at 11:28 PM, Peng Yu wrote:


I have the following code, which tests the split on a data.frame and
the split on each column (as vector) separately. The runtimes are of
10 time difference. When m and k increase, the difference become even
bigger.

I'm wondering why the performance on data.frame is so bad. Is it a bug
in R? Can it be improved?


You might want to look at the data.table package. The author claims
significant speed improvements over data.frames.


This bug has been found long time back and a package has been
developed for it. Should the fix be integrated in data.frame rather
than be implemented in an additional package?


What bug?


Is the slow speed in splitting a data.frame a performance bug?



NO!

The two computations are not equivalent.

One is a list whose elements are split vectors, and the other is a list of 
data.frames containing those vectors.


If you take the trouble to assemble that list of data frames from the 
list of split vectors you will see that it is very time consuming.


Read up on memory management issues. Think about what the computer 
actually has to do in terms of memory access to split a data.frame versus 
split a vector.
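
A small sketch of the difference, with toy sizes chosen here for illustration. It shows that the two calls return differently shaped results, and that extra assembly work is needed to get from the split vectors to per-group tables.

```r
set.seed(0)
m <- 12; n <- 3; k <- 3
x <- replicate(n, rnorm(m))
f <- sample(1:k, size = m, replace = TRUE)

by_df  <- split(as.data.frame(x), f)    # one data frame per group
by_col <- lapply(seq_len(ncol(x)),      # per column: one vector per group
                 function(i) split(x[, i], f))

class(by_df[[1]])        # a data frame
class(by_col[[1]][[1]])  # a plain numeric vector

# Extra step: for each group, bind that group's piece of every column.
rebuilt <- lapply(names(by_df), function(g) {
  do.call(cbind, lapply(by_col, `[[`, g))
})

# The rebuilt matrices hold the same numbers as the split data frames.
same <- all(mapply(function(a, b) {
  isTRUE(all.equal(unname(as.matrix(a)), unname(b)))
}, by_df, rebuilt))
same
```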


---

And even if it were simply a matter of having code that is slow for some 
application, that would not be a bug. Read the FAQ!


Chuck









Charles C. Berry(858) 534-2098
Dept of Family/Preventive Medicine
E mailto:cbe...@tajo.ucsd.edu   UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901



Re: [R] Announcing a new R news site: R-bloggers.com

2009-12-09 Thread Tal Galili
Hello Elijah,
You could not have made me happier with your letter.
It is a very satisfying compliment for me to be appreciated in general, and
specifically by you.


I will start by responding to the simpler parts of your e-mail and then
proceed to the more interesting parts.
My response grew to be quite long, though I tried to remain fresh. I hope it
turned out ok.


First, thank you for the historical context of PlanetR and also for putting
it up in the first place. I have already taken some steps to contact people on
the PlanetR list, but not all have responded to me.

Now regarding the more general (implicit) questions that come up from your
letter and from Dirk's: what is the purpose of R-bloggers.com and that of
PlanetR (especially when services like Google Reader and other such sources
are abundant)?

For me there are two audiences:
One is that of the web 2.0 power users. That is, people who know what RSS is
and use it, and maybe even write their own blogs. These people have only one
problem (as I see it) that R-bloggers tries to solve, and that is to know
who else lives in their ecosystem and who else they should follow.
For that, the Google Reader recommendation system is great, but not enough. A
much better system would be one place where all R bloggers could
go and write down their website, so that all of us would know they exist. That is
what R-bloggers offers the power users. I think this is also why over 20
of them have subscribed to the site's RSS feed.
BTW, the idea came to me when I was trying to find all the
dance bloggers for my wife (who is a dance researcher and blogger herself).
After a while we started http://www.dancebloggers.com/ while knowing of only
10 bloggers. The list now has over 80 bloggers, most of whom we would
not have known about without this hub.
I am trying to do the same thing for the R community, which is why I hope
more R bloggers will write about the service, so that their network of readers,
which includes other R bloggers, will add themselves and we will all know
about them.
If that were my only purpose, a simple directory would have been enough. But
I also have a second one, and that is to help the second audience.

The second audience I am thinking of is the people in our community who are not
so much early adopters (and actually quite late adopters) of the new
facilities that the new web (a.k.a. web 2.0) provides.
To them the whole RSS thing is too much to look at, and they are used to
e-mails. Because of that they have (until now) been disconnected from many of
the R bloggers out there, simply because it is inefficient for them to go
through all these blogs each day (or even each week). So for them, seeing all the
content in one place (and even getting an e-mail about it) would be (I hope) a
service. I believe that's why 5 of them (so far) have subscribed via e-mail.
I also hope teachers will direct their students to this as a resource for
getting a sense of what people who are using R are doing.
Another hint about the R community is that the
Facebook fan box is still empty, which tells me that (sadly) very few R
users are actively using Facebook as a means of connecting with the wider
networks of people out there.

All of the above also explains why R-bloggers will only take feeds of bloggers,
and only (as far as possible) those of their posts that are centered around
R (hence the website name :) ).
This follows what Gabor talked about: having a site whose content is
only about R. It also follows what I wish for, which is to have content in the
sense of articles to read (mostly), and not so much things like news feeds from
Wikipedia or announcements of new packages.


Regarding what you suggested about turning the site into more of a
community enterprise, I don't see how to do that. Right now, adding
the feeds is a very simple process and the rate of people adding themselves
is very low, so I don't think I will need help with that. I would love to
see more people in our community becoming even more social online, but I
don't think that R bloggers (http://www.r-bloggers.com/) should be the place
for that; rather, it should happen on each of the blogs that write about R,
and also on services like http://crantastic.org/, which I really hope will
somehow be pushed more by the R core team so as to serve all of us with more
input from the R community of users.

I hope this was at least an interesting read for some of you :)
And Elijah, *thanks again* for your kind words!
Best regards,
Tal







Contact
Details:---
Contact me: tal.gal...@gmail.com |  972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
www.r-statistics.com/ (English)
--




On Tue, Dec 8, 2009 at 7:12 PM, Elijah Wright elijah.wri...@gmail.com wrote:


 Hi Tal!

 First let me say that I deeply appreciate the work that 

[R] equivalent of ifelse

2009-12-09 Thread carol white
Hi,
Is there any equivalent for ifelse (except if (cond) expr1 else expr2) which 
takes an atomic element as argument but returns vector since ifelse returns an 
object of the same length as its argument?

x = c(1,2,3)
y = c(4,5,6,7)
z = 3

ifelse(z = 3,x,y)

would return x and not 1

thanks



Re: [R] Greek symbols on ylab= using barchart() {Lattice}

2009-12-09 Thread baptiste auguie
barchart(1:2, ylab=expression("Concentration ("*mu*g/m^3*")"))



Re: [R] equivalent of ifelse

2009-12-09 Thread Henrique Dallazuanna
Try this:

list('TRUE' = x, 'FALSE' = y)[[as.character(as.name(z = 1))]]





-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O



Re: [R] Greek symbols on ylab= using barchart() {Lattice}

2009-12-09 Thread Peng Cai
Thanks, it worked!



Re: [R] equivalent of ifelse

2009-12-09 Thread David Winsemius


On Dec 9, 2009, at 12:40 PM, carol white wrote:


Hi,
Is there any equivalent for ifelse (except if (cond) expr1 else  
expr2) which takes an atomic element as argument but returns vector  
since ifelse returns an object of the same length as its argument?


x = c(1,2,3)
y = c(4,5,6,7)
z = 3

ifelse(z = 3,x,y)

would return x and not 1


I worry that this is too simple, so wonder if you have expressed your  
intent clearly.


 if(z = 3) {x} else {y}
[1] 1 2 3
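
To make the contrast explicit, a minimal sketch. The comparison z == 3 is an assumption made here, since the operator in the posted code did not survive archiving.

```r
x <- c(1, 2, 3)
y <- c(4, 5, 6, 7)
z <- 3

# Assumed condition: z == 3 (the original operator is not recoverable).

# ifelse() is vectorised over the test: a length-1 test gives a length-1 result.
a <- ifelse(z == 3, x, y)

# if/else evaluates one condition and returns the whole chosen object.
r <- if (z == 3) x else y

a
r
```

Because if/else is an expression in R, its value can be assigned directly, as in the line defining r.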





David Winsemius, MD
Heritage Laboratories
West Hartford, CT



Re: [R] Significant performance difference between split of a data.frame and split of vectors

2009-12-09 Thread Peng Yu
On Wed, Dec 9, 2009 at 11:20 AM, Charles C. Berry cbe...@tajo.ucsd.edu wrote:
 On Wed, 9 Dec 2009, Peng Yu wrote:

 On Tue, Dec 8, 2009 at 11:06 PM, David Winsemius dwinsem...@comcast.net
 wrote:

 On Dec 9, 2009, at 12:00 AM, Peng Yu wrote:

 On Tue, Dec 8, 2009 at 10:37 PM, David Winsemius
 dwinsem...@comcast.net
 wrote:

 On Dec 8, 2009, at 11:28 PM, Peng Yu wrote:

 I have the following code, which tests the split on a data.frame and
 the split on each column (as vector) separately. The runtimes are of
 10 time difference. When m and k increase, the difference become even
 bigger.

 I'm wondering why the performance on data.frame is so bad. Is it a bug
 in R? Can it be improved?

 You might want to look at the data.table package. The author claims
 significant speed improvements over data.frames.

 This bug has been found long time back and a package has been
 developed for it. Should the fix be integrated in data.frame rather
 than be implemented in an additional package?

 What bug?

 Is the slow speed in splitting a data.frame a performance bug?


 NO!

 The two computations are not equivalent.

 One is a list whose elements are split vectors, and the other is a list of
 data.frames containing those vectors.

I made a comparable example below. Splitting the data.frame is still much
slower than the second approach that I show.

 If you take the trouble to assemble that list of data frames from the list
 of split vectors you will see that it is very time consuming.

It is not, as I show in the example below.

 Read up on memory management issues. Think about what the computer actually
 has to do in terms of memory access to split a data.frame versus split a
 vector.

I'd like to read more on how R does memory management. Would you please
point me to a good source?

But again, R is not user friendly. It took me quite a long time to
figure out that splitting a data.frame is a bottleneck in my program
and to reduce the problem to a test case. I don't know how memory
management is done in R, so I don't know if it is possible to fix
the problem for splitting a data.frame without perturbing the
interface of data.frame. But if splitting a data.frame is
this slow, maybe it could be forbidden and an alternative
documented somewhere.

 ---

 And even if it were simply a matter of having code that is slow for some
 application, that would not be a bug. Read the FAQ!

The definition of a bug in the FAQ is narrower than I thought.
Whatever the definition of a bug is, split() on a data.frame is a
perfectly legitimate operation (in terms of the interface). A quick fix
for this problem is to at least single out the case where the argument
is a data.frame and do what I have been doing below. That is why I say
this is a performance bug. Similar cases, where a faster alternative
exists but is not used, would reasonably be called bugs, at least in
many other languages.
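
The point about the two computations not being equivalent can be made concrete. The following is only a minimal sketch, assuming a numeric matrix x and a grouping vector f like those in the timing example; idx, by_rows and by_df are illustrative names, not anything from base R. It splits the row indices once and subsets the matrix per group, then checks that the numbers agree with split() on the data.frame even though the containers differ.

```r
set.seed(0)
m <- 30; n <- 6; k <- 3
x <- replicate(n, rnorm(m))                 # m x n numeric matrix
f <- sample(1:k, size = m, replace = TRUE)  # grouping vector

# Split the row indices once, then subset the matrix for each group.
idx <- split(seq_len(nrow(x)), f)
by_rows <- lapply(idx, function(i) x[i, , drop = FALSE])

# Reference result: split the data.frame directly.
by_df <- split(as.data.frame(x), f)

# Same numbers in both results, stored as matrices vs. data.frames.
same <- mapply(
  function(a, b) isTRUE(all.equal(unname(a), unname(as.matrix(b)))),
  by_rows, by_df
)
all(same)
```

Working on a matrix sidesteps the data.frame subsetting method, which is where most of the extra time is likely to go; whether that matters depends on the sizes of m and k.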

 m=30
 n=6
 k=3

 set.seed(0)
 x=replicate(n,rnorm(m))
 f=sample(1:k, size=m, replace=T)

 system.time(split(as.data.frame(x),f))
   user  system elapsed
 39.020   0.010  39.084

 v=lapply(
+ 1:dim(x)[[2]]
+ , function(i) {
+   split(x[,i],f)
+ }
+ )

 system.time(lapply(
+ 1:dim(x)[[2]]
+ , function(i) {
+   split(x[,i],f)
+ }
+ )
+ )
   user  system elapsed
  2.520   0.000   2.526

 system.time(
+ mapply(
+ function(...) {
+   cbind(...)
+ }
+ , v[[1]], v[[2]], v[[3]], v[[4]], v[[5]], v[[6]]
+ )
+ )
   user  system elapsed
  0.920   0.000   0.927




 David.

 system.time(split(as.data.frame(x),f))

  user  system elapsed
  1.700   0.010   1.786

 system.time(lapply(

 +         1:dim(x)[[2]]
 +         , function(i) {
 +           split(x[,i],f)
 +         }
 +         )
 +     )
  user  system elapsed
  0.170   0.000   0.167

 ###
 m=3
 n=6
 k=3000

 set.seed(0)
 x=replicate(n,rnorm(m))
 f=sample(1:k, size=m, replace=T)

 system.time(split(as.data.frame(x),f))

 system.time(lapply(
      1:dim(x)[[2]]
      , function(i) {
        split(x[,i],f)
      }
      )
  )

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

 David Winsemius, MD
 Heritage Laboratories
 West Hartford, CT




Re: [R] equivalent of ifelse

2009-12-09 Thread Duncan Murdoch

On 09/12/2009 12:40 PM, carol white wrote:

Hi,
Is there any equivalent for ifelse (except if (cond) expr1 else expr2) which 
takes an atomic element as argument but returns vector since ifelse returns an 
object of the same length as its argument?
  


I don't understand what's wrong with  if (cond) expr1 else expr2.  It 
can be used in an expression, e.g.


w <- if (z <= 3) x else y

which is I think exactly what you are asking for.
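
To make the difference concrete, here is a minimal sketch using the vectors from the question. The comparison z <= 3 is an assumption (the archive dropped the operator); any length-one condition behaves the same way.

```r
x <- c(1, 2, 3)
y <- c(4, 5, 6, 7)
z <- 3

# ifelse() is vectorised over its test: a length-one test gives a
# length-one result, so only the first element of x comes back.
a <- ifelse(z <= 3, x, y)

# if/else evaluates exactly one branch and returns it whole.
w <- if (z <= 3) x else y

length(a)
length(w)
```

So if/else is the right tool when the condition is a single value and the result should be an entire vector.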

Duncan Murdoch

x = c(1,2,3)
y = c(4,5,6,7)
z = 3

ifelse(z <= 3,x,y)

would return x and not 1
  
thanks




Re: [R] Time Series Rating Model

2009-12-09 Thread ryusuke

http://n4.nabble.com/file/n956255/TimeSeries%2BLikelihood.jpg 

I need to find the maximum likelihood estimate of ξ shown in the attachment,
but I am stuck writing the R code. I would appreciate any expert advice.
Thanks.



ryusuke wrote:
 
 
 To R programming experts,
 
 I am an undergraduate student doing research on my own. I am applying the
 diagonal bivariate Poisson model (R package bivpois) with a stochastic
 weighting function (see dixoncoles97, sections 4.5 to 4.7). However, I don't
 know how to fit this stochastic weighting function to the completed
 bivariate Poisson model.
 
 I know that some other references on dynamic soccer team rating apply
 the two methods below, but I am not familiar with them:
 1. Brownian Motions
  ifs (CRAN package)
  sde (CRAN package)
  dvfBm (CRAN package)
  
  2. Kalman Filters
  FKF (CRAN package)
  KFAS (CRAN package)
 
 I attach some references. I have also uploaded the model as an R data file,
 model.RData, to my SkyDrive.
 
 I would appreciate it if you could share your advice or
 suggestions. Thank you.
 
 
 Best Regards,
 
 Ryusuke
 A Soccer Scores Modelling Enthusiast
 
 
 
 



Re: [R] equivalent of ifelse

2009-12-09 Thread Márcio Resende



David Winsemius wrote:
 
 
 On Dec 9, 2009, at 12:40 PM, carol white wrote:
 
 Hi,
 Is there any equivalent for ifelse (except if (cond) expr1 else  
 expr2) which takes an atomic element as argument but returns vector  
 since ifelse returns an object of the same length as its argument?

 x = c(1,2,3)
 y = c(4,5,6,7)
 z = 3

 ifelse(z <= 3,x,y)

 would return x and not 1
 
 I worry that this is too simple, so wonder if you have expressed your  
 intent clearly.
 
   if(z <= 3) {x} else {y}
 [1] 1 2 3

 
 I was wondering, David, why are the {} necessary?
 if(z <= 3) x else y 
 [1] 1 2 3
 
 since it gives the same result without the {}?
 
 Thanks
 MR.
 
 
 


Re: [R] arrow plots

2009-12-09 Thread Cable, Samuel B Civ USAF AFMC AFRL/RVBXI
Thanks, all, for the help.  Much obliged.  I realize now that I should
have said that I am using lattice graphics.  The par() command has not
been helpful in convincing lattice to plot outside of the default
window.  Any other advice is appreciated.  Thanks again.




Re: [R] grep() exclude certain patterns?

2009-12-09 Thread Greg Snow
I think that we are talking past each other here.  You are clearly not 
understanding (or at least convinced by) what I am saying, and you are not 
convincing me (or possibly I am not understanding your arguments).  So our 
efforts will probably be better spent on things other than continuing this 
discussion.  After a few general comments on why I like where R is and the 
direction it is going, this is likely to be my last contribution to this 
conversation.

You have expressed interest in books in the past, as the free documentation has 
not been sufficient for you, you may be interested in some of the books listed 
here: http://www.r-project.org/doc/bib/R-books.html

Some that may fit your needs (others may as well, but I have not read 
everything on the list) include:
S Programming
An R and S-Plus Companion to Applied Regression
Modern Applied Statistics with S
A Handbook of Statistical Analyses Using R  (this one has the 1st chapter and 
all example available for free).


You say that R is written by statisticians rather than software engineers.  10 
years ago I was in the last year of my graduate program and my job at the time 
had me traveling and interacting with a lot of people who had purchased and 
were using the same commercial package as I was.  These were both stats 
professors and professional statisticians.  Some of the discussions were about 
some added functionality that we wished for.  When I tried to pass these 
suggestions on to one of the programmers (CS graduate) of the commercial 
package, he proceeded to give me a lecture on what the users really wanted.

I'll take the package created by statisticians for statisticians over the ones 
by software engineers who don't work in the field.

You compared R to C++ a couple of times, but this is not really a valid 
comparison.  C++ was never intended to be used interactively, S/R had this as a 
core concept from the beginning.  As a slightly better comparison, one previous 
job that I had I was working with programs written in C, when I switched to 
using Perl (another language famous for and proud of nonstandard function calls 
and inconsistencies) my productivity doubled.  This is not saying that C is 
bad, I still use it where appropriate, just that Perl was better for that job.  
I can see how someone could program a t-test in C++, but R would be a lot 
quicker; on the other hand, I would not choose R as the programming language if 
I were creating a full accounting system, the next word processor, an 
improved spreadsheet, or the next hot game (though I am guilty of programming 
games in R).

R is not perfect, if it were, there would not be all the new releases.  But I 
am happy with it and the direction it is going.  You would like more structure, 
more standards committees, etc.  Here is one example of why I don't like the 
idea of those things.  A couple of years ago I posted to this list with a 
question about something I was trying to do, I included an example of what I 
had tried, what I was trying to accomplish, and how the results differed from 
what I wanted.  My post appeared on Friday.  On Saturday a member of R core 
responded that the functions I was using were never intended to work the way 
that I was trying to make them work, and it was unlikely that I would ever get 
them to work that way.  He did however mention that he could see a possibility 
of a new function that did what I wanted.  On Sunday another person replied and 
said they would also be interested in the new function.  On Monday, the member 
of R core wrote again saying that he had just committed the new function, 
which did exactly what I asked for, to the development 
version of R.  Contrast that with the last time I contacted tech support for a 
commercial package that I was paying maintenance fees for, it took them longer 
than that to get back to me with their first answer, which did not even work, 
and even longer to get back with a working answer that turned out to be more 
complicated than what I had worked out for myself in the meantime.

So, I for one am very happy with R and the direction it is going.  I am 
grateful to R core and all the others who are improving this great program.  
And I am trying to do my part in improving it.  

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111



Re: [R] choose.files limit?

2009-12-09 Thread Etienne Bellemare Racine
Gunnar,

Did you find a solution? I'm facing the same problem (I would like to load 
2868 files in one go). Is it a bug, or just something that should be 
documented in the help? Also, the error message is misleading, as it says 
it «cannot find» a certain file (the name given is truncated in my 
case), instead of «too many files selected».

I'm not really familiar with bug filing; could someone with more 
experience tell whether it's really a bug? It's easy to reproduce: just 
select a lot of files (unlike Gunnar, I couldn't find the exact number) 
with choose.files().

Etienne
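
If all the files live in one directory, a possible workaround is to build the vector of file names programmatically instead of selecting them in the dialog. This is only a minimal sketch, assuming the files share an extension; the directory and file names below are made up for the demonstration, and it does not address the choose.files() limit itself.

```r
# Create a throwaway directory holding many small files (demo only).
demo_dir <- file.path(tempdir(), "many_files_demo")
dir.create(demo_dir, showWarnings = FALSE)
paths <- file.path(demo_dir, sprintf("file_%04d.txt", 1:1200))
invisible(file.create(paths))

# list.files() returns every matching name as a character vector,
# with no dialog involved.
found <- list.files(demo_dir, pattern = "\\.txt$", full.names = TRUE)
length(found)
```

The resulting vector can then be looped over in the same way as the value returned by choose.files().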

Gunnar W. Schade wrote:
 Howdy,

 When I use the choose.files command to read a large number of 
 file names into a character vector inside a function, in order to access 
 these files one after the other, there appears to be a limit. I do not 
 know whether it is arbitrary, but in this case the limit was 991 
 files. The file names are long. Does that matter? The error message 
 that appears says that it cannot find file #992 (giving the file 
 name), although it is certainly there (I tried changing which file is 
 #992 and it did not matter).

 Suggestions?

 - Gunnar


 ---
 Dr. Gunnar W. Schade
 Assistant Professor
 Texas A&M University
 Department of Atmospheric Sciences
 1104 Eller O&M Building
 3150 TAMU
 College Station, TX 77843-3150
 USA

 ph.: 979 845 0633
 Fax: 979 862 4466



Re: [R] code

2009-12-09 Thread Greg Snow
There is an hpd function in the TeachingDemos package that may do what you 
want.  There are other hpd related functions in other packages as well.  Which 
will work best for you depends on details that you did not provide.
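
For readers who only have a sample of draws, one common sample-based approach is to take the shortest interval that contains the requested fraction of the sorted values. The sketch below is written in base R and is not the TeachingDemos implementation; hpd_from_sample and its argument names are hypothetical, and it assumes a unimodal distribution and at least two observations.

```r
# Shortest interval containing (about) a fraction `prob` of the sample.
hpd_from_sample <- function(x, prob = 0.95) {
  x <- sort(x)
  n <- length(x)
  gap <- max(1, min(n - 1, round(n * prob)))  # index distance spanned
  starts <- seq_len(n - gap)                  # candidate left end points
  widths <- x[starts + gap] - x[starts]       # width of each candidate
  i <- which.min(widths)                      # narrowest one wins
  c(lower = x[i], upper = x[i + gap])
}

set.seed(42)
draws <- rexp(10000)   # skewed, so the interval should hug zero
h <- hpd_from_sample(draws)
h
```

For a skewed sample like this one, the interval differs noticeably from the equal-tailed quantile interval, which is the usual reason for asking for an HPD interval in the first place.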

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
 project.org] On Behalf Of Rahim Alhamzawi
 Sent: Wednesday, December 09, 2009 3:24 AM
 To: r-h...@stat.math.ethz.ch
 Subject: [R] code
 
 Dear support,
 I want to compute the highest probability density for any data. please
 do you have any code to help me in this subject.
 I am looking forward to hearing from you as soon as possible.
 
 rahim
 
 
 


[R] .Rhistory in R.app

2009-12-09 Thread Maria Gouskova
Dear R users,

I am having a minor but annoying issue with R.app. It doesn't retain
the history information from the previous sessions. By history, I
mean a record of commands/functions entered into R rather than the
list of objects--that is properly recorded in the .Rdata file as well
as in a workspace file I save separately.

System details:

R version 2.9.0
R.app GUI 1.28
Mac OS 10.6.2 (MacBook, Intel 2.4 GHz, 4 Gb RAM)

Things I've done:

R > Preferences > Startup > History: "Read history file on startup" is
checked; R history file directory is specified with a path to my
preferred directory (~/Documents/...). I've tried it with the default
setting, too--it makes no difference.

I've checked the permissions on the .Rhistory file. The default
.Rhistory file created by R has the permissions set at -rw-r--r--.

I've moved the .Rhistory file to a different location (Desktop), so
that R would create a new one. Makes no difference--command history is
still empty at startup.

R has kept track of history on my system in the past--the file I moved
to the desktop has a record of my work from about a year ago. (By the
way, that file's permissions are -rwx-.) Judging by what is in the
old .Rhistory file, the problem started around the time of my upgrade
from 2.7.x to 2.8. I am reluctant to upgrade to R 2.10 in the middle
of a project, because every R upgrade I've done in the past has broken
something, and I've had nothing but grief with my open source apps
after upgrading to Snow Leopard. So if there is some kind of a fix
that doesn't involve upgrading R, I'd love to hear about it.

Maria Gouskova



[R] Population Histogram

2009-12-09 Thread terry johnson
How would I make a population histogram in R from an excel file? Thanks



Re: [R] Can elements of a list be passed as multiple arguments?

2009-12-09 Thread David Winsemius


On Dec 9, 2009, at 12:14 PM, Peng Yu wrote:

On Tue, Dec 8, 2009 at 11:05 PM, David Winsemius dwinsem...@comcast.net 
 wrote:


On Dec 8, 2009, at 11:37 PM, Peng Yu wrote:


I want to split a matrix, where both 'u' and 'w' are results of
possible ways. However, whenever 'n' changes, the argument passed to
mapply() has to change. Is there a way to pass elements of a list as
multiple arguments?


You need to explain what you want in more detail. In your example, mapply did
exactly what you told it to. No errors. Three matrices. What were you
expecting when you gave it three lists in each argument?


I want a general solution so that I don't have to always write
v[[1]], v[[2]], ..., v[[n]] like in the following, because the
following way would not work if 'n' is an arbitrary number.

w=mapply(function(x,y) {cbind(x,y)}, v[[1]], v[[2]], ..., v[[n]])

One way that I can think of is to somehow expand a list (i.e., v in
this case) to a set of arguments that can be passed to 'mapply()'.


The functions illustrated on the help page for Reduce address the task  
of passing arbitrarily long lists of arguments to functions expecting  
two. It's possible that do.call might address this, but I have not  
come up with a strategy that deals with your structures.
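
One way to expand a list into separate arguments is do.call(). The following is a minimal sketch, assuming v is the list of per-column splits built as in the example in this thread; SIMPLIFY = FALSE is an addition made here so that mapply() always returns a list of matrices, and the sizes are arbitrary.

```r
set.seed(0)
m <- 30; n <- 4; k <- 3
x <- replicate(n, rnorm(m))
f <- sample(1:k, size = m, replace = TRUE)

# One list of per-group vectors for each column of x.
v <- lapply(seq_len(ncol(x)), function(i) split(x[, i], f))

# do.call() turns the elements of a list into the arguments of a call,
# so this is mapply(FUN = cbind, v[[1]], ..., v[[n]], SIMPLIFY = FALSE)
# for whatever n happens to be.
w <- do.call(mapply, c(list(FUN = cbind), v, list(SIMPLIFY = FALSE)))

sapply(w, ncol)   # number of columns in each group matrix
```

Because the argument list is assembled with c(), nothing in the call has to change when n changes.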


--
David.





m=10
n=2
k=3

set.seed(0)
x=replicate(n,rnorm(m))
f=sample(1:k, size=m, replace=T)

u=split(as.data.frame(x),f)

v=lapply(
  1:dim(x)[[2]]
  , function(i) {
    split(x[,i],f)
  }
  )

w=mapply(
  function(x,y) {
    cbind(x,y)
  }
  , v[[1]], v[[2]]
  )




Re: [R] Subset sum problem.

2009-12-09 Thread Hans W Borchers
Geert Janssens janssens-geert at telenet.be writes:

 
 Hi,
 
 I'm quite new to the R-project. I was suggested to look into it because I am 
 trying to solve the Subset sum problem, which basically is:
 Given a set of integers and an integer s, does any non-empty subset sum to s?
 (See http://en.wikipedia.org/wiki/Subset_sum_problem)
 
 I have been searching the web for quite some time now (which is how I 
 eventually discovered that my problem is called subset sum), but I can't seem 
 to find an easily applicable implementation. I did search the list archive, 
 the R website and used the help.search and apropos function. I'm afraid 
 nothing obvious showed up for me.
 
 Has anybody tackled this issue before in R ? If so, I would be very grateful 
 if you could share your solution with me.


Is it really true that you only want to see a Yes or No answer to this
question whether a subset sums up to s --- without learning which numbers
this subset is composed of (the pure SUBSET SUM problem)?
Then the following procedure does that in a reasonable amount of time
(returning 'TRUE' or 'FALSE' instead of Y-or-N):

# Exact algorithm for the SUBSET SUM problem
exactSubsetSum <- function(S, t) {
  S <- S[S <= t]
  if (sum(S) < t) return(FALSE)
  S <- sort(S, decreasing=TRUE)
  n <- length(S)
  L <- c(0)
  for (i in 1:n) {
    L <- unique(sort(c(L, L + S[i])))
    L <- L[L <= t]
    if (max(L) == t) return(TRUE)
  }
  return(FALSE)
}

# Example with a set of cardinality 64
amount - 4748652
products - 
c(30500,30500,30500,30500,42000,42000,42000,42000,
  42000,42000,42000,42000,42000,42000,71040,90900,
  76950,35100,71190,53730,456000,70740,70740,533600,
  83800,59500,27465,28000,28000,28000,28000,28000,
  26140,49600,77000,123289,27000,27000,27000,27000,
  27000,27000,8,33000,33000,55000,77382,48048,
  51186,4,35000,21716,63051,15025,15025,15025,
  15025,80,111,59700,25908,829350,1198000,1031655)

# Timing is not that bad
system.time( sol <- exactSubsetSum(products, amount) )
#  user  system elapsed 
# 0.516   0.096   0.673 
sol
# [1] TRUE
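
If the members of the subset are wanted as well, the same idea can be extended by remembering how each reachable sum was first produced. This is only a sketch, assuming positive integers and a positive target; find_subset_sum and its bookkeeping vectors are hypothetical names, and memory use grows with the number of distinct reachable sums.

```r
# Returns one subset of S that sums to t, or NULL if there is none.
find_subset_sum <- function(S, t) {
  sums <- 0             # sums reachable so far
  last <- NA_integer_   # index of the element that first produced each sum
  prev <- NA_real_      # the sum that element was added to
  for (i in seq_along(S)) {
    cand <- sums + S[i]
    keep <- cand <= t & !(cand %in% sums)   # new sums within the target
    n_new <- sum(keep)
    prev <- c(prev, sums[keep])
    last <- c(last, rep(i, n_new))
    sums <- c(sums, cand[keep])
    if (t %in% sums) break
  }
  if (!(t %in% sums)) return(NULL)
  # Walk back from t to 0, collecting the elements used on the way.
  picked <- integer(0)
  cur <- t
  while (cur != 0) {
    j <- match(cur, sums)
    picked <- c(picked, last[j])
    cur <- prev[j]
  }
  S[rev(picked)]
}

res <- find_subset_sum(c(3, 34, 4, 12, 5, 2), 9)
res
sum(res)
```

Each element index is recorded only for sums it newly creates, so the walk back from the target never uses the same element twice.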

 Thank you very much.
 
 Geert




Re: [R] .Rhistory in R.app

2009-12-09 Thread Rolf Turner


I just experimented and I find that, exactly as Maria described,
on my system no commands get added to .Rhistory when I start R using
the GUI.

The ``timestamp'' that is implemented in my .Rprofile gets added, but
no commands that I typed in the GUI window appeared.

I had never noticed this before since I never actually use the GUI.
(Like ***all civilized*** people, I use the command line exclusively. :-) )

I am running R 2.10.0, so updating is not the issue.

Maria:  Why don't you just start R from the command line, like a
civilized person ( :-) ) and forget about the expletive deleted GUI, which only
gets in the way of serious work?

cheers,

Rolf Turner

P. S.:

  sessionInfo()
R version 2.10.0 (2009-10-26)
i386-apple-darwin8.11.1

locale:
[1] en_NZ.UTF-8/en_NZ.UTF-8/C/C/en_NZ.UTF-8/en_NZ.UTF-8

attached base packages:
[1] datasets  utils stats graphics  grDevices methods   base

other attached packages:
[1] misc_0.0-11fortunes_1.3-6 MASS_7.3-3



Re: [R] .Rhistory in R.app

2009-12-09 Thread Rob Goedman
Maria,

Try changing the name of .Rhistory in the Startup preferences to something like 
.Rosxhistory. Press enter to make sure the change is accepted and try again.
The problem is that R itself overwrites the file .Rhistory if it is told to 
save the workspace.

Rob





Re: [R] Significant performance difference between split of a data.frame and split of vectors

2009-12-09 Thread Peng Yu
On Tue, Dec 8, 2009 at 10:37 PM, David Winsemius dwinsem...@comcast.net wrote:

 On Dec 8, 2009, at 11:28 PM, Peng Yu wrote:

 I have the following code, which tests split() on a data.frame and
 split() on each column (as a vector) separately. The runtimes differ by a
 factor of 10. As m and k increase, the difference becomes even
 bigger.

 I'm wondering why the performance on a data.frame is so bad. Is it a bug
 in R? Can it be improved?

 You might want to look at the data.table package. The author claims
 significant speed improvements over data.frames

'data.table' doesn't seem to help. You can try the other set of m, n, k.
In both cases, using as.data.frame is faster than using as.data.table.

Please let me know if I have understood what you meant.

 m=10
 n=6
 k=3

 #m=30
 #n=6
 #k=3

 set.seed(0)
 x=replicate(n,rnorm(m))
 f=sample(1:k, size=m, replace=T)

 library(data.table)
Loading required package: ref
dim(refdata) and dimnames(refdata) no longer allow parameter ref=TRUE,
use dim(derefdata(refdata)), dimnames(derefdata(refdata)) instead
 system.time(split(as.data.frame(x),f))
   user  system elapsed
  0.000   0.000   0.003
 system.time(split(as.data.table(x),f))
   user  system elapsed
  0.010   0.000   0.011

 system.time(split(as.data.frame(x),f))

  user  system elapsed
  1.700   0.010   1.786

 system.time(lapply(

 +         1:dim(x)[[2]]
 +         , function(i) {
 +           split(x[,i],f)
 +         }
 +         )
 +     )
  user  system elapsed
  0.170   0.000   0.167

 ###
 m=3
 n=6
 k=3000

 set.seed(0)
 x=replicate(n,rnorm(m))
 f=sample(1:k, size=m, replace=T)

 system.time(split(as.data.frame(x),f))

 system.time(lapply(
       1:dim(x)[[2]]
       , function(i) {
         split(x[,i],f)
       }
       )
   )



Re: [R] Why cannot get the expected values in my function

2009-12-09 Thread rusers.sh
Thanks very very much. It is really not easy to change from one language to
another. :)

2009/12/9 Gavin Simpson gavin.simp...@ucl.ac.uk

 On Tue, 2009-12-08 at 23:22 -0500, David Winsemius wrote:
  On Dec 8, 2009, at 11:07 PM, rusers.sh wrote:
 
   Hi,
In the following function, i hope to save my simulated data into the
   result dataset, but why the final result dataset seems not to be
   generated.
 
 
   #Function
   simdata <- function (nsim) {
 
  # Instead why not:
  cbind(x=runif(nsim), y=runif(nsim) )

 or:

 m <- matrix(runif(nsim*2), ncol = 2)
 ## if names on m needed
 colnames(m) <- c("x","y")

 G
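
The original function body is not shown in full here, so the following is only a sketch of the usual pattern, assuming the function was filling an object called result from inside its body. Assignments made inside a function are local to that call, so the function should return the matrix and the caller should store the returned value.

```r
simdata <- function(nsim) {
  m <- matrix(runif(nsim * 2), ncol = 2)
  colnames(m) <- c("x", "y")
  m   # the last value evaluated is what the function returns
}

set.seed(1)
result <- simdata(10)   # keep the returned matrix under the name result
dim(result)
```

Calling simdata(10) without assigning it only prints the value; nothing named result is changed by the call.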

}
  
   #simulation
   simdata(10)  #correct result
x   y
   [1,] 0.2655087 0.372123900
   [2,] 0.1848823 0.702374036
   [3,] 0.1680415 0.807516399
   [4,] 0.5858003 0.008945796
   [5,] 0.2002145 0.685218596
   [6,] 0.6062683 0.937641973
   [7,] 0.9889093 0.397745453
   [8,] 0.4662952 0.207823317
   [9,] 0.2216014 0.024233910
   [10,] 0.5074782 0.306768506
   But the dataset result was not assigned the above values. What is
   the problem?
   result  #wrong result??
 x  y
   [1,] NA NA
   [2,] NA NA
   [3,] NA NA
   [4,] NA NA
   [5,] NA NA
   [6,] NA NA
   [7,] NA NA
   [8,] NA NA
   [9,] NA NA
   [10,] NA NA
  
   Thanks a lot.
   --
   -
   Jane Chang
   Queen's
  
 --
 %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
  Dr. Gavin Simpson [t] +44 (0)20 7679 0522
  ECRC, UCL Geography,  [f] +44 (0)20 7679 0565
  Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
  Gower Street, London  [w] http://www.ucl.ac.uk/~ucfagls/
  UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
 %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%




-- 
-
Jane Chang
Queen's



[R] formula () problems

2009-12-09 Thread Dr R.K.S. Hankin

Hi.

I am having difficulty creating a formula for use with glm()

I have a matrix with an unknown number of columns and wish to estimate a
coefficient for each column, and one for each product of a column
with another column.

In the case of a five-column matrix this would be:


x <- matrix(rnorm(100),ncol=5)
colnames(x) <- letters[1:5]
z <- rnorm(20)
lm(z~ -1+(a+b+c+d+e)^2,data=data.frame(x))


Call:
lm(formula = z ~ -1 + (a + b + c + d + e)^2, data = data.frame(x))

Coefficients:
       a         b         c         d         e       a:b       a:c       a:d  
-0.30021  -0.21465   0.12208   0.06308   0.28806   0.34482  -1.00072   0.48218  
     a:e       b:c       b:d       b:e       c:d       c:e       d:e  
 0.28786  -0.46306   0.39844   0.04436   0.32236  -0.09210  -1.06625  





This is what I want: five single terms (a-e) and 5*(5-1)/2=10 (a:b to
d:e) for the cross terms.  If there were 6 columns I would want
(a+b+c+d+e+f)^2 and have 21 (=6+15) terms.

How do I create a formula that does this for an arbitrary number of columns?


thanks

Robin



Re: [R] formula () problems

2009-12-09 Thread Marc Schwartz


Robin,

Try this:

  lm(z ~ (.)^2 - 1, data = data.frame(x))

See the Details section of ?formula, which describes the use of '.' to  
refer to all columns not otherwise already in the formula.
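
As a rough illustration (the data below are simulated, and the six-column size is only an assumption for the demo), the '.' shorthand can be checked against a formula pasted together from the column names. Both should give the same 6 + 15 = 21 coefficients:

```r
set.seed(1)
x <- matrix(rnorm(300), ncol = 6)      # 50 rows, 6 columns (made-up data)
colnames(x) <- letters[1:6]
z <- rnorm(50)
d <- data.frame(x)

# '.' expands to every column of 'd', so this adapts to any number of columns
fit_dot <- lm(z ~ (.)^2 - 1, data = d)

# The same model with the right-hand side built explicitly from the names
rhs <- paste0("(", paste(colnames(x), collapse = " + "), ")^2 - 1")
fit_str <- lm(as.formula(paste("z ~", rhs)), data = d)

length(coef(fit_dot))                    # 6 main effects + choose(6, 2) cross terms
all.equal(coef(fit_dot), coef(fit_str))
```

The pasted version is only useful if a subset of the columns is wanted; otherwise the '.' form is shorter.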


HTH,

Marc Schwartz



Re: [R] Significant performance difference between split of a data.frame and split of vectors

2009-12-09 Thread David Winsemius


On Dec 9, 2009, at 2:59 PM, Peng Yu wrote:

On Tue, Dec 8, 2009 at 10:37 PM, David Winsemius dwinsem...@comcast.net 
 wrote:


On Dec 8, 2009, at 11:28 PM, Peng Yu wrote:


I have the following code, which tests the split on a data.frame and
the split on each column (as vector) separately. The runtimes differ by
a factor of 10. When m and k increase, the difference becomes even
bigger.

I'm wondering why the performance on data.frame is so bad. Is it a bug
in R? Can it be improved?


You might want to look at the data.table package. The author claims
significant speed improvements over data.frames


'data.table' doesn't seem to help. You can try the other set of m,n,k.
In both cases, using as.data.frame is faster than using as.data.table.

Please let me know if I understand what you meant.


I was only suggesting that you look at it because it appeared in other
situations to have efficiency advantages. As it turned out, that
structure offered no advantage when I tested it.


--
David.





m=10
n=6
k=3

#m=30
#n=6
#k=3

set.seed(0)
x=replicate(n,rnorm(m))
f=sample(1:k, size=m, replace=T)

library(data.table)

Loading required package: ref
dim(refdata) and dimnames(refdata) no longer allow parameter ref=TRUE,
use dim(derefdata(refdata)), dimnames(derefdata(refdata)) instead

system.time(split(as.data.frame(x),f))

  user  system elapsed
 0.000   0.000   0.003

system.time(split(as.data.table(x),f))

  user  system elapsed
 0.010   0.000   0.011


system.time(split(as.data.frame(x),f))
   user  system elapsed
  1.700   0.010   1.786

system.time(lapply(
+ 1:dim(x)[[2]]
+ , function(i) {
+   split(x[,i],f)
+ }
+ )
+ )
   user  system elapsed
  0.170   0.000   0.167

###
m=3
n=6
k=3000

set.seed(0)
x=replicate(n,rnorm(m))
f=sample(1:k, size=m, replace=T)

system.time(split(as.data.frame(x),f))

system.time(lapply(
  1:dim(x)[[2]]
  , function(i) {
split(x[,i],f)
  }
  )
  )



David Winsemius, MD
Heritage Laboratories
West Hartford, CT



[R] [R-pkgs] new version of RobASt-family of packages

2009-12-09 Thread Matthias Kohl
The new version 0.7 of our RobASt-family of packages has been available on 
CRAN for several days. As there were many changes we will only sketch 
the most important ones here. For more details see the corresponding 
NEWS files (e.g. news(package = "RobAStBase") or using function NEWS 
from package startupmsg, i.e. NEWS("RobAStBase")).



First of all the new package RobLoxBioC was added to the family which 
includes S4 classes and methods for preprocessing omics data, in 
particular gene expression data.



##
## All packages (RandVar, RobAStBase, RobLox, RobLoxBioC,
## ROptEst, ROptEstOld, ROptRegTS, RobRex)
##
- TOBEDONE file was added as a starting point for collaborations. The
file can be displayed via function TOBEDONE from package startupmsg;
e.g. TOBEDONE("distr")
- tests/Examples folder for some automatic testing was introduced


##
## Package RandVar
##
- mainly fixing of warnings and bugs


##
## Package RobAStBase
##
- enhanced plotting, in particular methods for qqplot
- unified treatment of NAs
- extended implementation for total variation neighbourhoods
- implementation of k-step estimator construction extended


##
## Package RobLox
##
- na.rm argument added
- introduction of finite-sample correction


##
## Package ROptEst
##
- optional use of alternative algorithm to obtain Lagrange multipliers
 using duality based optimization
- extended implementation for total variation neighbourhoods
- solutions for general parameter transformations with nuisance
 components
- several extensions to the examples in folder scripts
- implementation of k-step estimator construction extended


##
## Package ROptEstOld
##
- still needed for packages ROptRegTS and RobRex
- removed Symmetry and DistributionSymmetry implementation to
 make ROptEstOld compatible with distr 2.2


##
## Package ROptRegTS
##
- still depends on ROptEstOld


##
## Package RobRex
##
- moved some of the examples into \dontrun{} to reduce check time ...
- some minor corrections in ExamplesEstimation.R in folder scripts


Best
Peter
Matthias

--
Dr. Matthias Kohl
www.stamats.de

___
R-packages mailing list
r-packa...@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages



Re: [R] Significant performance difference between split of a data.frame and split of vectors

2009-12-09 Thread Charles C. Berry

On Wed, 9 Dec 2009, Peng Yu wrote:


On Wed, Dec 9, 2009 at 11:20 AM, Charles C. Berry cbe...@tajo.ucsd.edu wrote:

On Wed, 9 Dec 2009, Peng Yu wrote:


On Tue, Dec 8, 2009 at 11:06 PM, David Winsemius dwinsem...@comcast.net
wrote:


On Dec 9, 2009, at 12:00 AM, Peng Yu wrote:


On Tue, Dec 8, 2009 at 10:37 PM, David Winsemius
dwinsem...@comcast.net
wrote:


On Dec 8, 2009, at 11:28 PM, Peng Yu wrote:


I have the following code, which tests the split on a data.frame and
the split on each column (as vector) separately. The runtimes differ by
a factor of 10. When m and k increase, the difference becomes even
bigger.

I'm wondering why the performance on data.frame is so bad. Is it a bug
in R? Can it be improved?


You might want to look at the data.table package. The author claims
significant speed improvements over data.frames


This bug has been found long time back and a package has been
developed for it. Should the fix be integrated in data.frame rather
than be implemented in an additional package?


What bug?


Is the slow speed in splitting a data.frame a performance bug?



NO!

The two computations are not equivalent.

One is a list whose elements are split vectors, and the other is a list of
data.frames containing those vectors.


I made a comparable example below. Still, splitting a data.frame is much
slower compared with the second way that I'm showing.


If you take the trouble to assemble that list of data frames from the list
of split vectors you will see that it is very time consuming.


It is not as I show in the example below.



You are comparing creating a matrix to creating a data.frame.



system.time(
+  spl <- mapply(
+ function(...) {
+   cbind(...)
+ }
+ , v[[1]], v[[2]], v[[3]], v[[4]], v[[5]], v[[6]]
+ )
+ )
   user  system elapsed
  1.204   0.016   1.478


system.time(
+  spl <- mapply(
+ function(...) {
+   data.frame(...)
+ }
+ , v[[1]], v[[2]], v[[3]], v[[4]], v[[5]], v[[6]],SIMPLIFY=FALSE
+ )
+ )
   user  system elapsed
 56.088   0.104  56.478





If you just want a list of matrices, use


system.time(split.data.frame(x,f))

   user  system elapsed
  0.524   0.016   0.927
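
To make the distinction between the three results concrete, here is a small sketch. The sizes m, n and k are made up for illustration and are not the ones timed above; the point is only what kind of object each call returns:

```r
set.seed(0)
m <- 1000; n <- 6; k <- 50            # assumed sizes, for illustration only
x <- replicate(n, rnorm(m))           # m x n numeric matrix
f <- sample(1:k, size = m, replace = TRUE)

# split() on a data.frame: a list of data.frames, one per level of f
by_df  <- split(as.data.frame(x), f)

# split() column by column: n lists, each holding plain numeric vectors
by_vec <- lapply(seq_len(ncol(x)), function(i) split(x[, i], f))

# split.data.frame() called directly on the matrix: a list of matrices
by_mat <- split.data.frame(x, f)

class(by_df[[1]])
class(by_vec[[1]][[1]])
is.matrix(by_mat[[1]])
```

The per-group values are the same in all three; only the container differs, and building many small data.frames is where the extra work goes.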





Read up on memory management issues. Think about what the computer actually
has to do in terms of memory access to split a data.frame versus split a
vector.


I'd like to read more on how R does memory management. Would you please
point me to a good source?


I see now that the timing issue was not one of memory, but of doing more 
work (see Rprof results below) to create a data.frame. But if you are 
interested you might look at


Golub, Gene H.; Van Loan, Charles F. (1996), Matrix Computations (3rd 
ed.), Johns Hopkins, ISBN 978-0-8018-5414-9 .


and/or Google BLAS memory




But again, R is not user friendly. It took me quite a long time to
figure out that splitting a data.frame is a bottle neck in my program
and reduce the problem into a test case.



See

?Rprof

and note where the 'self.time's are largest below (not in split or 
split.data.frame):



Rprof()
res <- split(as.data.frame(x),f)
Rprof(NULL)
summaryRprof()

$by.self
                      self.time self.pct total.time total.pct
attr                      33.66     72.9      33.66      72.9
[.data.frame               3.26      7.1      45.70      98.9
inherits                   1.52      3.3       2.06       4.5
anyDuplicated              1.04      2.3       1.42       3.1
[[.data.frame              1.00      2.2       4.76      10.3
[[                         0.74      1.6       5.50      11.9
match                      0.66      1.4       2.96       6.4
<Anonymous>                0.66      1.4       0.72       1.6
sys.call                   0.46      1.0       0.46       1.0
all                        0.38      0.8       0.38       0.8
anyDuplicated.default      0.36      0.8       0.38       0.8
%in%                       0.32      0.7       3.26       7.1
names                      0.26      0.6       0.26       0.6
is.factor                  0.24      0.5       2.30       5.0
length                     0.20      0.4       0.20       0.4
attr<-                     0.18      0.4       0.18       0.4
as.character               0.16      0.3       0.16       0.3
[                          0.14      0.3      45.84      99.2
<-                         0.14      0.3       0.14       0.3
!                          0.12      0.3       0.12       0.3
.Call                      0.12      0.3       0.12       0.3
!=                         0.10      0.2       0.10       0.2
vector                     0.06      0.1       0.26       0.6
as.data.frame.matrix       0.06      0.1       0.08       0.2
|                          0.06      0.1       0.06       0.1
lapply                     0.04      0.1      46.12      99.8
                           0.04      0.1       0.04       0.1
any                        0.04      0.1       0.04       0.1
is.na  

[R] [R-pkgs] doMPI 0.1-3

2009-12-09 Thread Stephen Weston
I'd like to announce the availability of the new doMPI package, a
parallel backend for the foreach package, which acts as an adaptor to
the Rmpi package.  The package has been uploaded to CRAN and is now
available.

Like the doSNOW package, doMPI allows you to execute foreach loops
in parallel using Rmpi as the underlying transport.  But I was
interested in experimenting with using Rmpi directly so that data that
was used in all iterations of a foreach loop could be broadcast to the
cluster workers using the Rmpi mpi.bcast function.  I also wanted to
write the package so it could fetch arguments and process results
dynamically, allowing it to handle an arbitrary number of tasks in a
memory efficient way.

The package includes a number of example scripts and an introductory
vignette, in addition to the standard help documentation.  The vignette
also attempts to explain how to run doMPI scripts using the Open MPI
orterun command, which I hope helps people who are new to Rmpi get
started running parallel R programs.

- Steve Weston



[R] re-ordering labels issue using barchart().

2009-12-09 Thread Peng Cai
Hi All,

I have a question regarding re-ordering legend labels; I'm not sure if it
would be possible to do what I'm thinking...

Using the code below, I reordered the labels for the legend.

Now in the stacked bars, we see 1931 values are plotted below 1932 whereas
in the legend 1931 comes above 1932. So if we look at the stacked
bars, color blue is below pink, whereas if one looks at
the legend it's the other way around.

Is there a way to re-order the legend so that the colors appear in the same order?

Code:
library(lattice)
barley$year <- factor(barley$year, levels=c(1931,1932))
barchart(yield ~ variety | site, data = barley,
  groups = year, layout = c(1,6), stack = TRUE,
  auto.key = list(points = FALSE, rectangles = TRUE, space = "right"),
  ylab = "Barley Yield (bushels/acre)",
  scales = list(x = list(rot = 45)))

Thanks a lot,
Peng Cai



[R] Fwd: Evaluating a function within a pre-defined environment?

2009-12-09 Thread David Reiss
Hi all,

I have a somewhat confusing question that I was wondering if someone
could help with. I have a pre-defined environment with some variables,
and I would like to define a function, such that when it is called, it
actually manipulates the variables in that environment, leaving them
to be examined later. I see from the R language definition that

When a function is called, a new environment (called the evaluation
environment) is created, whose enclosure (see Environment objects) is
the environment from the function closure. This new environment is
initially populated with the unevaluated arguments to the function; as
evaluation proceeds, local variables are created within it.

So basically, I think I am asking if it is possible to pre-create my
own evaluation environment and have it retain the state that it was
in at the end of the function call?

Example:

e <- new.env()
e$x <- 3
f <- function(xx) x <- x + xx

can I then call f(2) and have it leave e$x at 5 after the function
returns? I know that

environment(f) <- e

goes part of the way, but I would like to let the function also write
to the environment.

Thanks for any advice.

--David



[R] What is the development cycle where there are code in tests/ for package development?

2009-12-09 Thread Peng Yu
I see 'library(stats)' at the beginning of
R-2.10.0/src/library/stats/tests/nls.R.

I'm wondering if I am developing my own package 'mypackage' whether I
should put 'library(mypackage)' in a .R file in mypackage/tests/? If I
do, then it seems awkward to me, because to use 'library(mypackage)',
I have to first get 'mypackage' installed.

So the development cycle is: try test cases in tests -> see bugs in
'mypackage' -> modify the code in 'mypackage' -> install
'mypackage' -> try test cases in tests again

But I think it would be faster if the step of installing the package were
avoided. So instead of using 'library(mypackage)', I'd think to use
'source("some_file_in_mypackage.R")' in any file in tests/. Could
somebody let me know what the current standard way of developing a
package is? Why is 'library(mypackage)' used rather than
'source("some_file_in_mypackage.R")'?



Re: [R] Fwd: Evaluating a function within a pre-defined environment?

2009-12-09 Thread Gabor Grothendieck
e <- new.env()
e$x <- 2
f <- function(a, e) { e$x <- e$x + a; e$x }
f(3, e)
e$x # 5

Another way to accomplish this is to use the proto package which puts
the whole thing into an object oriented framework.  See
http://r-proto.googlecode.com

library(proto)
p <- proto(x = 2, f = function(this, a) { this$x <- this$x + a; this$x })
p$f(3) # 5




Re: [R] Fwd: Evaluating a function within a pre-defined environment?

2009-12-09 Thread David Reiss
Ideally I would like to be able to use the function f (in my example)
as-is, without having to designate the environment as an argument, or
to otherwise have to use e$x in the function body.

thanks for any further advice...



Re: [R] Population Histogram

2009-12-09 Thread milton ruser
1. Read the posting guide:
http://www.R-project.org/posting-guide.html
2. Did you install R?
2.1. If yes, go to the Help menu and click on Manuals (in PDF)

bests

miltinho

On Wed, Dec 9, 2009 at 1:47 PM, terry johnson
terry.johnson@gmail.com wrote:

 How would I make a population histogram in R from an excel file? Thanks



Re: [R] Subset sum problem.

2009-12-09 Thread Geert Janssens
On Wednesday 9 December 2009, Hans W Borchers wrote:
 Geert Janssens janssens-geert at telenet.be writes:
  Hi,
 
  I'm quite new to the R-project. I was suggested to look into it because I
  am trying to solve the Subset sum problem, which basically is:
  Given a set of integers and an integer s, does any non-empty subset sum
  to s? (See http://en.wikipedia.org/wiki/Subset_sum_problem)
 
  I have been searching the web for quite some time now (which is how I
  eventually discovered that my problem is called subset sum), but I can't
  seem to find an easily applicable implementation. I did search the list
  archive, the R website and used the help.search and apropos function. I'm
  afraid nothing obvious showed up for me.
 
  Has anybody tackled this issue before in R ? If so, I would be very
  grateful if you could share your solution with me.

 Is it really true that you only want to see a Yes or No answer to this
 question whether a subset sums up to s --- without learning which numbers
 this subset is composed of (the pure SUBSET SUM problem)?
 Then the following procedure does that in a reasonable amount of time
 (returning 'TRUE' or 'FALSE' instead of Y-or-N):

Unfortunately no. I do need the numbers in the subset. But thank you for 
presenting this code.

Geert

 # Exact algorithm for the SUBSET SUM problem
 exactSubsetSum <- function(S, t) {
   S <- S[S <= t]
   if (sum(S) < t) return(FALSE)
   S <- sort(S, decreasing=TRUE)
   n <- length(S)
   L <- c(0)
   for (i in 1:n) {
     L <- unique(sort(c(L, L + S[i])))
     L <- L[L <= t]
     if (max(L) == t) return(TRUE)
   }
   return(FALSE)
 }

 # Example with a set of cardinality 64
 amount - 4748652
 products -
 c(30500,30500,30500,30500,42000,42000,42000,42000,
   42000,42000,42000,42000,42000,42000,71040,90900,
   76950,35100,71190,53730,456000,70740,70740,533600,
   83800,59500,27465,28000,28000,28000,28000,28000,
   26140,49600,77000,123289,27000,27000,27000,27000,
   27000,27000,8,33000,33000,55000,77382,48048,
   51186,4,35000,21716,63051,15025,15025,15025,
   15025,80,111,59700,25908,829350,1198000,1031655)

 # Timing is not that bad
 system.time( sol <- exactSubsetSum(products, amount) )
 #  user  system elapsed
 # 0.516   0.096   0.673
 sol
 # [1] TRUE
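
Since the members of the subset are what is actually needed, here is a possible variation on the idea above. It is only a sketch, not code from this thread: the function name subsetSumMembers and its bookkeeping vectors are made up for illustration, and it assumes positive integers and t > 0. For every reachable sum it remembers which item produced it first and from which smaller sum, then walks back from t.

```r
# Returns the indices (into S) of one subset summing to t, or NULL if none.
# Assumes all elements of S are positive integers and t > 0.
subsetSumMembers <- function(S, t) {
  sums <- 0             # reachable sums found so far (0 = empty subset)
  from <- NA_real_      # for each reachable sum, the sum it was built from
  item <- NA_integer_   # for each reachable sum, index of the item added last
  for (i in seq_along(S)) {
    cand <- sums + S[i]                      # sums reachable by adding item i
    keep <- cand <= t & !(cand %in% sums)    # only new sums not exceeding t
    if (any(keep)) {
      from <- c(from, sums[keep])
      item <- c(item, rep(i, sum(keep)))
      sums <- c(sums, cand[keep])
    }
    if (t %in% sums) break
  }
  if (!(t %in% sums)) return(NULL)
  picked <- integer(0)
  s <- t
  while (s != 0) {                           # walk back to the empty subset
    j <- match(s, sums)
    picked <- c(picked, item[j])
    s <- from[j]
  }
  rev(picked)
}

S <- c(3, 34, 4, 12, 5, 2)
idx <- subsetSumMembers(S, 9)
S[idx]          # members of one subset that sums to 9
sum(S[idx])
```

Each sum is recorded only the first time it is reached, and its predecessor was already reachable from earlier items, so no item is used twice. Memory grows with the number of distinct reachable sums up to t, so this may be slow for large targets.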




Re: [R] Fwd: Evaluating a function within a pre-defined environment?

2009-12-09 Thread Gabor Grothendieck
You could write a wrapper function that accepts the output of the
function you don't want to change and then sets the values.

# f is function we don't want to change
f <- function(a)  x + a

wrapper <- function(x, e) {
   environment(f) <- e
   e$x <- f(x)
}

e <- new.env()
e$x <- 2
wrapper(3, e)
e$x # 5

or with proto:

library(proto)
p <- proto(x = 2, f = f, wrapper = function(this, x) this$x <-
with(this, f)(x) )
p$wrapper(3)
p$x # 5
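
For comparison, a variant that was not proposed in this thread: if a one-character change to the function is acceptable, the superassignment operator `<<-` together with environment(f) <- e lets the function write into e without taking it as an argument. This is a sketch under that assumption and does not leave f strictly as-is:

```r
e <- new.env()
e$x <- 3

# '<<-' assigns to an existing 'x' found in the enclosing environments
# of the function, instead of creating a local 'x'
f <- function(xx) x <<- x + xx
environment(f) <- e    # make 'e' the enclosure of f

f(2)
e$x   # 5
```

Note that if no x exists anywhere along the chain of enclosing environments, `<<-` creates one in the global environment, so e$x should be initialised first.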






[R] Fwd: conditionally merging adjacent rows in a data frame

2009-12-09 Thread Marek Janad
I've also made some comparisons and, taking execution time into
account, sqldf wins. summaryBy is better than aggregate in some specific
situations I have met in practice. I present such a situation below. It
assumes that there are at least two grouping variables with a high number
of levels.

n <- 10
grp1 <- sample(1:750, n, replace=T)
grp2 <- sample(1:750, n, replace=T)
d <- data.frame(x=rnorm(n), y=rnorm(n), grp1=grp1, grp2=grp2)

# sqldf
library(sqldf)
Rprof('prof');
sqldf("select grp1, grp2, avg(x), avg(y) from d group by grp1, grp2")
Rprof(NULL);
summaryRprof('prof')

#by
#do.call(rbind, by(d, list(d$grp1, d$grp2), function(x) transform(x, x
= mean(x), y = mean(y))[1,,drop = FALSE ]))

#doBy
library(doBy)
Rprof('prof');
summaryBy(x+y~grp1+grp2, data=d, FUN=c(mean))
Rprof(NULL);
summaryRprof('prof')

#aggregate
Rprof('prof');
aggregate(d, list(d$grp1, d$grp2), function(x)mean(x))
Rprof(NULL);
summaryRprof('prof')



-- Forwarded message --
From: Nikhil Kaza nikhil.l...@gmail.com
Date: 2009/12/9
Subject: Re: [R] conditionally merging adjacent rows in a data frame
To: Titus von der Malsburg malsb...@gmail.com
Cc: r-help@r-project.org



This is great!! Sqldf is exactly the kind of thing I was looking for,
other stuff.

I suppose you can speed up both functions 1 and 5 using aggregate and
tapply only once, as was suggested earlier. But it comes at the
expense of readability.

Nikhil

On 9 Dec 2009, at 7:59AM, Titus von der Malsburg wrote:

 On Wed, Dec 9, 2009 at 12:11 AM, Gabor Grothendieck
 ggrothendi...@gmail.com wrote:

 Here are a couple of solutions.  The first uses by and the second sqldf:

 Brilliant!  Now I have a whole collection of solutions.  I did a simple
 performance comparison with a data frame that has 7929 lines.

 The results were as follows (loading appropriate packages is not included in
 the measurements):

 times <- c(0.248, 0.551, 41.080, 0.16, 0.190)
 names(times) <- c("aggregate","summaryBy","by+transform","sqldf","tapply")
 barplot(times, log="y", ylab="log(s)")

 So sqldf clearly wins, followed by tapply and aggregate.  summaryBy is slower
 than necessary because it computes both mean /and/ sum for both x and dur.
 by+transform presumably suffers from the construction of many intermediate data
 frames.

 Are there any canonical places where R-recipes are collected?  If yes I would
 write-up a summary.

 These were the competitors:

 # Gary's and Nikhil's aggregate solution:

 aggregate.fixations1 <- function(d) {

  idx <- c(TRUE,diff(d$roi)!=0)
  d2  <- d[idx,]

  idx <- cumsum(idx)
  d2$dur <- aggregate(d$dur, list(idx), sum)[2]
  d2$x   <- aggregate(d$x, list(idx), mean)[2]

  d2
 }

 # Marek's summaryBy:

 library(doBy)

 aggregate.fixations2 <- function(d) {

  idx <- c(TRUE,diff(d$roi)!=0)
  d2  <- d[idx,]

  d$idx <- cumsum(idx)
  d2$r <- summaryBy(dur+x~idx, data=d, FUN=c(sum,
 mean))[c("dur.sum", "x.mean")]
  d2
 }

 # Gabor's by+transform solution:

 aggregate.fixations3 <- function(d) {

  idx <- cumsum(c(TRUE,diff(d$roi)!=0))

  d2 <- do.call(rbind, by(d, idx, function(x)
                transform(x, dur = sum(dur), x = mean(x))[1,,drop = FALSE ]))

  d2
 }

 # Gabor's sqldf solution:

 library(sqldf)

 aggregate.fixations4 <- function(d) {

  idx  <- c(TRUE,diff(d$roi)!=0)
  d2   <- d[idx,]

  d$idx  <- cumsum(idx)
  d2$r <- sqldf("select sum(dur), avg(x) x from d group by idx")

  d2
 }

 # Titus' solution using plain old tapply:

 aggregate.fixations5 <- function(d) {

  idx  <- c(TRUE,diff(d$roi)!=0)
  d2   <- d[idx,]

  idx  <- cumsum(idx)
  d2$dur <- tapply(d$dur, idx, sum)
  d2$x <- tapply(d$x, idx, mean)

  d2
 }
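
For anyone who wants to see what the competitors compute, here is a minimal sketch of the tapply approach on a tiny data frame. The data are made up for illustration (the original 7929-line data set is not shown in the thread), and the variable names first and run are mine.

```r
# Toy fixation data (made up): consecutive rows with the same roi
# should be merged, summing dur and averaging x.
d <- data.frame(roi = c(1, 1, 2, 2, 2, 1),
                dur = c(10, 20, 5, 5, 10, 7),
                x   = c(1, 3, 4, 6, 8, 2))

first <- c(TRUE, diff(d$roi) != 0)   # TRUE where a new run of roi starts
run   <- cumsum(first)               # run id for every row
d2    <- d[first, ]                  # keep the first row of each run

# as.vector() drops the names and dim that tapply() attaches
d2$dur <- as.vector(tapply(d$dur, run, sum))
d2$x   <- as.vector(tapply(d$x, run, mean))
d2
```

The three runs (rows 1-2, rows 3-5, row 6) collapse to three rows, so the last roi = 1 row is not merged with the first two.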

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Marek



[R] Plotting frequency curve over histogram

2009-12-09 Thread Gaurav Moghe
Hello,

This is a problem for which there seem to be several solutions online, but
not really. My question was about plotting a curve over the histogram. All
the previous posts and messages talk about generating a *density histogram*
using (freq=F) and then plotting the density curve. However, I find that
that seriously distorts my data and the plot becomes confounding to the
viewer.

I was wondering if there's a way to do the following 2 things:
1) Plot both histogram and the overlying frequency curve in one plot
2) Plot multiple frequency curves in a single plot

I have been using the hist function for my job.

I'd appreciate if anyone could help me with the solution

Thanks,
Gaurav



[R] Difficulty with terminal properly displaying help function in an ESS remote session

2009-12-09 Thread Matthew Keller
Hi all,

I'm logging into a Debian server and running R remotely using ESS. The
steps I use to do this are below (pasted from my webpage). However,
we're having a problem whenever we want to use the help function,
e.g.,
?hist

The remote buffer gives a warning:
 WARNING: terminal is not fully functional
-  (press RETURN)

At this point we can't get back to our normal R session. When I use
ESS locally on my MacPro, help screens open up in a split buffer below
the one I'm working on, which is fine.

So there are two issues:
1) How do we switch from the help screen back to the R session?
2) Is it possible to have help screens open up in separate buffers or
in split buffers when using ESS remotely?

Any help is appreciated!

Matt


Steps we take to use ESS remotely:
1) Open up your *.R script you’d like to use

2) Open a shell inside Emacs by typing “M-x shell”

3) From within this shell, ssh to the server you want to use. When
doing this, you need to make sure to specify two important ssh
options: compression (which compresses data coming to you, making the
connection seem *much* faster) and X11 forwarding (which allows you to
use interactive graphing features via X11). E.g.:
ssh -XC usern...@servername.colorado.edu

4) You should now be logged into the server, just as you would be if
you’d used a terminal rather than Emacs. Now open up R as you usually
would on that server. E.g.:
R --arch=x86_64

5) You should be in R now. To allow this R session to be linked to
your *.R script, use this command in the remote R session:
M-x ess-remote
In the Emacs mini-buffer prompt, type:
r

6) Now you should be able to send code from your *.R script to the
remote R session as you normally would (e.g., C-c C-j).

7) Last, you might need to change the options in your remote R session
to graph using X11 rather than whatever default driver is being used.
To do this in R, type:
options(device='x11')

8) That’s it. Make sure it all works by typing something like:
hist(rnorm(50)) #which should return a histogram of rnorm to your screen!


-- 
Matthew C Keller
Asst. Professor of Psychology
University of Colorado at Boulder
www.matthewckeller.com



Re: [R] Exporting Contingency Tables with xtable

2009-12-09 Thread Na'im R. Tyson

Gabor,

Thanks for the advice.  Using the 'rowlabel' switch works, but when  
used with the 'collabel' switch, I received the following error in  
Latex 2e:


! LaTeX Error: Illegal character in array arg.

The line that LaTeX has issue with is:

\multicolumn{1}{observed}{uh}

The entire table looks like:

\begin{table}[!tbp]
 \begin{center}
 \begin{tabular}{lrr}\hline\hline
\multicolumn{1}{l}{predicted}&
\multicolumn{1}{observed}{uh}&
\multicolumn{1}{l}{uh~}
\tabularnewline \hline
uh&$201$&$30$\tabularnewline
uh~&$  6$&$10$\tabularnewline
\hline
\end{tabular}
\end{center}
\end{table}

I know that LaTeX has issues with unescaped tildes, but it does not  
explain why I get this error.
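
One possible explanation, offered as a guess rather than a diagnosis: R matches argument names partially, so if latex() has no argument called collabel but does have one whose name starts with collabel (for example a justification argument such as collabel.just, which I have not verified against the Hmisc source), then the string "observed" would be used as a column justification. That would produce exactly a line like \multicolumn{1}{observed}{uh}. The sketch below shows the mechanism with a made-up function; toy_latex() is not Hmisc code.

```r
# toy_latex() is a made-up stand-in, not Hmisc code. It only mimics a function
# that has an argument called collabel.just but none called collabel.
toy_latex <- function(x, rowlabel = "", collabel.just = "c") {
  sprintf("\\multicolumn{1}{%s}{%s}", collabel.just, x)
}

# 'collabel' is a unique prefix of 'collabel.just', so R matches it partially
# and the label text ends up where a column justification belongs.
bad  <- toy_latex("uh", collabel = "observed")
good <- toy_latex("uh")
cat(bad, good, sep = "\n")
```

If this is what is happening, ?latex would be the place to check which argument actually puts a heading over the columns.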


Na'im

On Dec 9, 2009, at 5:44 AM, Gabor Grothendieck wrote:


Try the latex function in the Hmisc package.  Using the state.*
variables built into R for sake of example:

library(Hmisc)
latex(table(state.division, state.region), rowlabel = "X", collabel = "Y",
      file = "")



On Wed, Dec 9, 2009 at 12:04 AM, Na'im R. Tyson  
nty...@clovermail.net wrote:

Dear R-philes:

I am having an issue with exporting contingency tables with xtable().  I set
up a contingency table and convert it to a matrix for passing to xtable() as
shown below.

v.cont.table <- table(v_lda$class, grps,
   dnn=c("predicted", "observed"))
v.cont.mat <- as.matrix(v.cont.table)

Both produce output as follows:

         observed
predicted  uh uh~
      uh  201  30
      uh~   6  10

However, when I construct the latex table with xtable(v.cont.mat), I get a
good table without the headings of predicted and observed.

\begin{table}[ht]
\begin{center}
\begin{tabular}{rrr}
 \hline
 & uh & uh\~{} \\
  \hline
uh & 201 & 30 \\
  uh\~{} & 6 & 10 \\
  \hline
\end{tabular}
\end{center}
\end{table}

Question: is there any easy way to retain or re-insert the dimension names
from the contingency table and matrix?








Re: [R] Plotting frequency curve over histogram

2009-12-09 Thread Rolf Turner


On 10/12/2009, at 12:52 PM, Gaurav Moghe wrote:


Hello,

This is a problem for which there seem to be several solutions online, but
not really. My question was about plotting a curve over the histogram. All
the previous posts and messages talk about generating a *density histogram*
using (freq=F) and then plotting the density curve. However, I find that
that seriously distorts my data and the plot becomes confounding to the
viewer.


How does this ``distort'' your data?  You are simply changing
the scale on the y-axis.


I was wondering if there's a way to do the following 2 things:
1) Plot both histogram and the overlying frequency curve in one plot


If you want to keep your histogram on the ``count'' scale,
just multiply your density curve by the constant by which
you would have divided the histogram values to change
counts into density values.
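
A minimal sketch of that rescaling, assuming equal-width bins and made-up data; the constant is the sample size times the bin width, and the names bw and k are just for this example.

```r
set.seed(1)
x <- rnorm(500)                      # made-up data

h  <- hist(x, main = "Counts with scaled density curve")  # count scale
bw <- diff(h$breaks)[1]              # bin width (equal-width bins assumed)
k  <- length(x) * bw                 # counts = density * n * bin width

d <- density(x)                      # kernel density estimate
lines(d$x, d$y * k)                  # same curve, rescaled to counts
```

The histogram stays on the count scale; only the curve is multiplied by k.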


2) Plot multiple frequency curves in a single plot


?lines
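
As a rough illustration of the lines() hint, here is one way to draw two frequency curves in a single plot. The two samples are made up, and using the bin midpoints as the x values of the curve is my choice, not something from the thread.

```r
set.seed(2)
a <- rnorm(300, mean = 0)            # two made-up samples
b <- rnorm(300, mean = 2)

# shared breaks so the counts of both samples refer to the same bins
br <- pretty(range(c(a, b)), n = 20)
ha <- hist(a, breaks = br, plot = FALSE)
hb <- hist(b, breaks = br, plot = FALSE)

plot(ha$mids, ha$counts, type = "l", xlab = "x", ylab = "Frequency",
     ylim = c(0, max(ha$counts, hb$counts)))
lines(hb$mids, hb$counts, lty = 2)   # second frequency curve on the same plot
legend("topright", legend = c("a", "b"), lty = 1:2)
```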


I have been using the hist function for my job.

I'd appreciate if anyone could help me with the solution


cheers,

Rolf Turner



Re: [R] Plotting frequency curve over histogram

2009-12-09 Thread Ted Harding
On 09-Dec-09 23:52:20, Gaurav Moghe wrote:
 Hello,
 This is a problem for which there seem to be several solutions online,
 but not really. My question was about plotting a curve over the
 histogram.
 All the previous posts and messages talk about generating a *density
 histogram* using (freq=F) and then plotting the density curve.
 However, I find that that seriously distorts my data and the plot
 becomes confounding to the viewer.
 
 I was wondering if there's a way to do the following 2 things:
 1) Plot both histogram and the overlying frequency curve in one plot
 2) Plot multiple frequency curves in a single plot
 
 I have been using the hist function for my job.
 
 I'd appreciate if anyone could help me with the solution
 
 Thanks,
 Gaurav

You presumably mean that the viewer expects to see a histogram of
counts, with the corresponding estimated curve of expected counts
for each bin-interval (NB *not* density!!) plotted over it.

The following is an example of how to achieve this.

  set.seed(54321)
  N  <- 1000
  x  <- rnorm(N)
  H  <- hist(x,breaks=50)
  dx <- (H$breaks[2]-H$breaks[1])
  m  <- mean(x)
  s  <- sd(x)
  x0 <- H$breaks
  x1 <- c(x0[1]-dx/2,x0+dx/2)
  y0 <- H$counts
  lines(x1,N*dnorm(x1,m,s)*dx)

In the above, m and s are the estimated Mean and SD of the fitted
Normal distribution. Therefore the estimated *density* at x is

  dnorm(x,m,s)     # equivalently dnorm((x - m)/s)/s

and a good approximation to the probability contained in a given
bin whose midpoint is at x1 is dnorm(x1,m,s)*dx, where dx is
the width of the bin. The total sample size being N, the expected
count for that bin is N*dnorm(x1,m,s)*dx.

With this explanation, the above should now be clear!

Ted.



E-Mail: (Ted Harding) ted.hard...@manchester.ac.uk
Fax-to-email: +44 (0)870 094 0861
Date: 10-Dec-09   Time: 00:51:58
-- XFMail --



Re: [R] Fwd: Evaluating a function within a pre-defined environment?

2009-12-09 Thread Charles C. Berry

On Wed, 9 Dec 2009, David Reiss wrote:


Ideally I would like to be able to use the function f (in my example)
as-is, without having to designate the environment as an argument, or
to otherwise have to use e$x in the function body.

thanks for any further advice...


Perhaps you want something along the lines of the open.account example of

R-intro 10.7 Scope

??

Chuck
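
The open.account example relies on the superassignment operator. A minimal sketch of the same idea applied to the f and e from the original question follows; it assumes that x already exists in e, otherwise `<<-` would keep searching outwards and could end up creating x in the global environment.

```r
e <- new.env()
e$x <- 3

# x is not created inside f, so '<<-' looks for it in the enclosing
# environment of f and modifies it there
f <- function(xx) x <<- x + xx
environment(f) <- e

f(2)
e$x   # 5
```

With this, f can be used as-is, without passing the environment as an argument or writing e$x in the function body.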



On Wed, Dec 9, 2009 at 2:36 PM, Gabor Grothendieck
ggrothendi...@gmail.com wrote:

e <- new.env()
e$x <- 2
f <- function(a, e) { e$x <- e$x + a; e$x }
f(3, e)
e$x # 5

Another way to accomplish this is to use the proto package which puts
the whole thing into an object oriented framework.  See
http://r-proto.googlecode.com

library(proto)
p <- proto(x = 2, f = function(this, a) { this$x <- this$x + a; this$x })
p$f(3) # 5


On Wed, Dec 9, 2009 at 4:54 PM, David Reiss dre...@systemsbiology.org wrote:

Hi all,

I have a somewhat confusing question that I was wondering if someone
could help with. I have a pre-defined environment with some variables,
and I would like to define a function, such that when it is called, it
actually manipulates the variables in that environment, leaving them
to be examined later. I see from the R language definition that

When a function is called, a new environment (called the evaluation
environment) is created, whose enclosure (see Environment objects) is
the environment from the function closure. This new environment is
initially populated with the unevaluated arguments to the function; as
evaluation proceeds, local variables are created within it.

So basically, I think I am asking if it is possible to pre-create my
own evaluation environment and have it retain the state that it was
in at the end of the function call?

Example:

e <- new.env()
e$x <- 3
f <- function(xx) x <- x + xx

can I then call f(2) and have it leave e$x at 5 after the function
returns? I know that

environment(f) <- e

goes part of the way, but I would like to let the function also write
to the environment.

Thanks for any advice.

--David









Charles C. Berry                            (858) 534-2098
                                            Dept of Family/Preventive Medicine
E mailto:cbe...@tajo.ucsd.edu               UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901



[R] How to figure out which the version of split is used?

2009-12-09 Thread Peng Yu
There are a number of functions that are dispatched to from split().

 methods('split')
[1] split.data.frame split.Date       split.default    split.POSIXct

Is there a way to figure out which of these variants is actually
dispatched to when I call split? I know that if the argument is of the
type data.frame, split.data.frame will be called. Is it the case that
if the argument is not of type data.frame, Date or POSIXct,
split.default will be called?
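
A rough sketch of how one might check this by hand: S3 dispatch walks along class(x) and falls back to the default method, so looking up each candidate with getS3method() approximates what split() will do. which_split() is a hypothetical helper written for this post, not part of R, and it ignores implicit classes and S4 dispatch.

```r
# which_split() is a hypothetical helper, not part of R
which_split <- function(x) {
  for (cl in c(class(x), "default")) {
    # optional = TRUE returns NULL instead of an error if no method exists
    if (!is.null(getS3method("split", cl, optional = TRUE)))
      return(paste0("split.", cl))
  }
  NA_character_
}

which_split(data.frame(a = 1:2))
which_split(Sys.Date())
which_split(1:3)
```

In a plain R session the last call should fall through to split.default, since there is no split method for integer vectors among those listed above.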



Re: [R] confint for glm (general linear model)

2009-12-09 Thread David Winsemius


On Dec 9, 2009, at 9:21 PM, casperyc wrote:



does no one know this?


Have you read the Posting Guide?




David Winsemius, MD
Heritage Laboratories
West Hartford, CT



[R] Have you used RGoogleDocs and RGoogleData?

2009-12-09 Thread Farrel Buchinsky
Both of these applications fulfill a great need of mine: to read data
directly from google spreadsheets that are private to myself and one or two
collaborators. Thanks to the authors. I had been using RGoogleDocs for
about 6 months (maybe more) but have had to stop using it in the past month
since, for some reason that I do not understand, it no longer reads google
spreadsheets. I loved it. Its loss depresses me. I started using RGoogleData,
which works.

I have noticed that both packages read data slowly. RGoogleData is much
slower than RGoogleDocs used to be. Both seem a lot slower than if one
manually downloaded a google spreadsheet as a csv and then used read.csv
function - but then I would not be able to use scripts and execute without
finding and futzing.

Can anyone explain in English why these packages read slower than a csv
download?
Can anyone explain what the core difference is between the two packages?
Can anyone share their experience with reading Google data straight into R?

Farrel Buchinsky
Google Voice Tel: (412) 567-7870

Sent from Pittsburgh, Pennsylvania, United States



Re: [R] confint for glm (general linear model)

2009-12-09 Thread casperyc

I think the help pages are exactly the same...
I just want to verify the confidence interval manually. That's all I want.

Thanks.

casper



brestat wrote:
 
 These functions are different. I advise you to study them:
 
 ?confint # profile likelihood
 ?confint.default # t-distribution
 
 Walmes Zeviani - Brazil
 
 
 
 casperyc wrote:
 
 Hi,
 
 I have a glm that gives the following summary,
 
                Estimate  Std. Error   z value    Pr(>|z|)
 (Intercept) -2.03693352 1.449574526 -1.405194 0.159963578
 A            0.01093048 0.006446256  1.695633 0.089955471
 N            0.41060119 0.224860819  1.826024 0.067846690
 S           -0.20651005 0.067698863 -3.050421 0.002285206
 
 then I use confint(k.glm) to obtain a confidence interval for the
 estimates.
 
 confint(k.glm,level=0.97)
 Waiting for profiling to be done...
                    1.5 %      98.5 %
 (Intercept) -5.471345995  0.94716503
 A           -0.002340863  0.02631582
 N           -0.037028592  0.95590178
 S           -0.365570347 -0.06573675
 
 While reading the help for 'confint', I found something like confint.glm
 for general linear models.
 I loaded the MASS package by clicking on the menu (or otherwise, how should
 I load the package?)
 
 but I still can't use the confint.glm command. What have I done wrong?
 
 
 How do I calculate this confidence interval for a glm estimate manually?
 
 For A, I use
 0.01093048 + c(-1,1) * 0.006446256 * qt(0.985,df=77)
 which gives a different interval from the one I got from
 confint(k.glm,level=0.97) above.
 
 In short, what's the right command to find the confidence interval for
 glm estimates?
 How do I verify it manually?
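
A minimal sketch of a manual check, under the assumption that the interval to reproduce is the Wald interval returned by confint.default() and not the profile-likelihood interval that confint() gives for a glm (that one has no simple closed form). The data and model here are made up, since k.glm itself was not posted. Note that the Wald interval uses a normal quantile, qnorm(), rather than qt().

```r
set.seed(42)
n <- 80
A <- rnorm(n)                                   # made-up predictor
y <- rbinom(n, 1, plogis(-0.5 + 0.8 * A))       # made-up binary response
fit <- glm(y ~ A, family = binomial)

level <- 0.97
est <- coef(fit)
se  <- sqrt(diag(vcov(fit)))                    # the "Std. Error" column
z   <- qnorm(1 - (1 - level) / 2)               # normal quantile, not qt()

manual <- cbind(est - z * se, est + z * se)     # estimate -/+ z * SE
wald   <- confint.default(fit, level = level)
manual
wald
```

The two matrices should agree, while confint(fit, level = 0.97) would in general give somewhat different limits because it profiles the likelihood.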
 
 Thanks.
 
 casper
 
 
 
 
 



Re: [R] confint for glm (general linear model)

2009-12-09 Thread David Winsemius


On Dec 9, 2009, at 9:50 PM, casperyc wrote:



I think the help page are exactly the same...


I cannot tell what you mean by this.

I just want to verify the confidence interval manually. That's all I want.


Then provide some reproducible code and data. ... as the Posting Guide  
explains and provides explicit examples of the manner for most  
effectively presenting data objects.


--
David





David Winsemius, MD
Heritage Laboratories
West Hartford, CT



Re: [R] [Rd] split() is slow on data.frame (PR#14123)

2009-12-09 Thread Peng Yu
I made a version for matrices, because it would be more efficient to
split each column of a matrix than to convert the matrix to a data.frame
and then call split() on the data.frame. Note that the versions for a
matrix and a data.frame are slightly different. Would somebody add this
to R as well?

split.matrix<-function(x,f) {
  #print('processing matrix')
  v=lapply(
      1:dim(x)[[2]]
      , function(i) {
        base:::split.default(x[,i],f)#the difference is here
      }
      )

  w=lapply(
      seq(along=v[[1]])
      , function(i) {
        result=do.call(
            cbind
            , lapply(v,
                function(vj) {
                  vj[[i]]
                }
                )
            )
        colnames(result)=colnames(x)
        return(result)
      }
      )
  names(w)=names(v[[1]])
  return(w)
}
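
For comparison, a shorter sketch that splits the row indices once and then subsets the matrix. This is an alternative technique, not the function above, and split_rows() is a hypothetical name; I have not timed it against split.matrix().

```r
# split_rows() is a hypothetical name for this sketch
split_rows <- function(x, f) {
  # split the row numbers by f, then pull out the matching rows;
  # drop = FALSE keeps one-row groups as matrices
  lapply(split(seq_len(nrow(x)), f),
         function(i) x[i, , drop = FALSE])
}

m <- matrix(1:12, ncol = 3, dimnames = list(NULL, c("a", "b", "c")))
g <- c("u", "v", "u", "v")
res <- split_rows(m, g)
res$u
```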


On Wed, Dec 9, 2009 at 5:44 PM, Charles C. Berry cbe...@tajo.ucsd.edu wrote:
 On Wed, 9 Dec 2009, William Dunlap wrote:

 Here are some differences between the current and proposed
 split.data.frame.

 Adding 'drop=FALSE' fixes this case. See in line correction below.

 Chuck


 d<-data.frame(Matrix=I(matrix(1:10, ncol=2)),
               Named=c(one=1,two=2,three=3,four=4,five=5),
               row.names=as.character(1001:1005))

 group<-c("A","B","A","A","B")
 split.data.frame(d,group)

 $A
    Matrix.1 Matrix.2 Named
 1001        1        6     1
 1003        3        8     3
 1004        4        9     4

 $B
    Matrix.1 Matrix.2 Named
 1002        2        7     2
 1005        5       10     5

 mysplit.data.frame(d,group) # lost row.names and 2nd column of Matrix

 [1] processing data.frame
 $A
    Matrix Named
 [1,]      1     1
 [2,]      3     3
 [3,]      4     4

 $B
    Matrix Named
 [1,]      2     2
 [2,]      5     5


 Bill Dunlap
 Spotfire, TIBCO Software
 wdunlap tibco.com

 -Original Message-
 From: r-devel-boun...@r-project.org
 [mailto:r-devel-boun...@r-project.org] On Behalf Of
 pengyu...@gmail.com
 Sent: Wednesday, December 09, 2009 2:10 PM
 To: r-de...@stat.math.ethz.ch
 Cc: r-b...@r-project.org
 Subject: [Rd] split() is slow on data.frame (PR#14123)

 Please see the following code for the runtime comparison between
 split() and mysplit.data.frame() (they do the same thing
 semantically). mysplit.data.frame() is a fix of split() in term of
 performance. Could somebody include this fix (with possible checking
 for corner cases) in future version of R and let me know the inclusion
 of the fix?

 m=30
 n=6
 k=3

 set.seed(0)
 x=replicate(n,rnorm(m))
 f=sample(1:k, size=m, replace=T)

 mysplit.data.frame<-function(x,f) {
  print('processing data.frame')
  v=lapply(
      1:dim(x)[[2]]
      , function(i) {
        split(x[,i],f)

 Change to:

         split(x[,i,drop=FALSE],f)


      }
      )

  w=lapply(
      seq(along=v[[1]])
      , function(i) {
        result=do.call(
            cbind
            , lapply(v,
                function(vj) {
                  vj[[i]]
                }
                )
            )
        colnames(result)=colnames(x)
        return(result)
      }
      )
  names(w)=names(v[[1]])
  return(w)
 }

 system.time(split(as.data.frame(x),f))
 system.time(mysplit.data.frame(as.data.frame(x),f))

 __
 r-de...@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-devel




 Charles C. Berry                            (858) 534-2098
                                             Dept of Family/Preventive Medicine
 E mailto:cbe...@tajo.ucsd.edu               UC San Diego
 http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901






Re: [R] [Rd] split() is slow on data.frame (PR#14123)

2009-12-09 Thread Peng Yu
Sorry. I sent this to r-help by mistake. Could somebody help delete it
from the archive?



[R] Assigning variables into an environment.

2009-12-09 Thread Rolf Turner


I am working with a somewhat complicated structure in which
I need to deal with a function that takes ``basic'' arguments
and also depends on a number of parameters which change depending
on circumstances.

I thought that a sexy way of dealing with this would be to assign
the parameters as objects in the environment of the function in
question.

The following toy example gives a bit of the flavour of what I
am trying to do:

foo <- function(x,zeta) {
  for(nm in names(zeta)) assign(nm,zeta[nm],envir=environment(bar))
  bar(x)
}

bar <- function(x) {
  alpha + beta*exp(gamma*x)
}

v <- c(alpha=2,beta=3,gamma=-4)

ls()
[1] "bar" "foo" "v"

foo(0.1,v)
  alpha
4.01096

2+3*exp(-4*0.1)
[1] 4.01096  # Check; yes it's working; but ...

ls()
[1] "alpha" "bar"   "beta"  "foo"   "gamma" "v"

The parameters got assigned in the global environment (as well as in
the environment of bar()? Or instead of?).

I didn't want that to happen.

Questions:

(a) What did I do wrong?
(b) What am I not understanding about environments?
(c) How can I get the parameters to be assigned in the environment of bar()
    and ***NOT*** in the global environment?
(d) Is it time to go to the pub yet?

[Please don't make suggestions about doing it all some other way, e.g.
using the ``...'' argument facility.  I know there are other ways to
attack my problem (but they may not be so efficacious in the context
of my real --- as opposed to toy --- example).  I want to try to get
the environment idea to work, and I want to understand more about
environments and how they work.]
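
On questions (a) to (c), a sketch of what is probably going on: a function's environment is the one it was defined in, so for a bar() created at top level environment(bar) is the global environment itself, and the assign() calls land there. Giving bar() its own environment first keeps the parameters out of the workspace. This is only one way to do it; zeta[[nm]] instead of zeta[nm] is my change, to drop the "alpha" name from the result.

```r
bar <- function(x) {
  alpha + beta * exp(gamma * x)
}
# give bar() a private environment; its parent is the environment this
# line is run in, so exp() and friends are still found
environment(bar) <- new.env()

foo <- function(x, zeta) {
  for (nm in names(zeta)) assign(nm, zeta[[nm]], envir = environment(bar))
  bar(x)
}

v <- c(alpha = 2, beta = 3, gamma = -4)
foo(0.1, v)
ls(environment(bar))   # the parameters live here, not in the workspace
```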

Thanks for any insights.

cheers,

Rolf Turner

P. S. Are there any articles about environments (in the sense used above)
out there in the literature?  I searched the contents of the R Journal/R News
but turned up nothing on environments.

R. T.



Re: [R] How to make the assignment in a for-loop not affect variables outside the loop?

2009-12-09 Thread Peng Yu
2009/11/22 Uwe Ligges lig...@statistik.tu-dortmund.de:
 Either use local as in:

 n=10

 local(for(i in 1:n){
     n=3
     print(n)
 })

 print(n)

'local()' makes everything inside it unavailable outside of it. Is
there a way to make 'n' unavailable outside but still make 'b'
available outside without using 'function'?

n=10
b=1
local(
for(i in 1:n) {
  n=3
  print(n)
  b=b*i
}
)

print(n)
print(b)
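
One possibility, sketched under the assumption that b already exists outside the local() block: ordinary assignment inside local() stays local, while `<<-` updates the b found in an enclosing environment.

```r
n <- 10
b <- 1
local({
  for (i in 1:n) {   # 1:n is evaluated once, with the outer n = 10
    n <- 3           # local copy only; the outer n is untouched
    b <<- b * i      # updates the b defined outside local()
  }
})

print(n)
print(b)
```

After the block, n is still 10 and b holds the product of 1 to 10.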


 or write a function that is evaluated in its own environment:

 n=10

 MyLoopFoo <- function(){
    for(i in 1:n){
        n <- 3
        print(n)
    }
 }

 MyLoopFoo()

 print(n)




 Uwe Ligges


 Peng Yu wrote:

 I know that R is a dynamic programming language. But I'm wondering if
 there is a way to make the assignment in a for-loop not affect
 variables outside the loop.

 > n=10
 > for(i in 1:n){
 +     n=3
 +     print(n)
 + }
 [1] 3
 [1] 3
 [1] 3
 [1] 3
 [1] 3
 [1] 3
 [1] 3
 [1] 3
 [1] 3
 [1] 3
 > print(n)
 [1] 3





[R] What is the function to test if a vector is ordered or not?

2009-12-09 Thread Peng Yu
I did a search on www.rseek.org to look for the function to test if a
vector is ordered or not. But I don't find it. Could somebody let me
know what function I should use?



[R] Counting Frequencies

2009-12-09 Thread BIGBEEF

Hi - I'm having difficulty with frequencies in R. I have a table with a
variable (column) called difference with 600 observations (rows). I would like
to know how many values are < -0.5 as well as how many are > 0.5. The rest
are obviously in the middle.

In SAS I could do this immediately but am unable to do it in R.

Thanks for your help.
-- 
View this message in context: 
http://n4.nabble.com/Counting-Frequencies-tp956556p956556.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] confint for glm (general linear model)

2009-12-09 Thread brestat

These functions are different. I advise you to study them:

?confint # for a glm fit: profile likelihood
?confint.default # Wald interval based on asymptotic normality

Walmes Zeviani - Brazil
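
The difference between the two can be sketched on a small example. The fit below uses the built-in mtcars data and is purely illustrative (it is not the poster's k.glm); the manual calculation uses a normal quantile, which is what confint.default() is based on, rather than a t quantile.

```r
# Hypothetical example fit on a built-in data set (not the poster's k.glm)
fit <- glm(am ~ wt, data = mtcars, family = binomial)

# Profile-likelihood interval (the method used for glm fits)
ci_profile <- confint(fit, level = 0.97)

# Wald interval: estimate +/- normal quantile * standard error
ci_wald <- confint.default(fit, level = 0.97)

# The Wald interval for 'wt' reproduced by hand from the coefficient table
est <- coef(summary(fit))["wt", "Estimate"]
se  <- coef(summary(fit))["wt", "Std. Error"]
ci_manual <- est + c(-1, 1) * qnorm(0.985) * se

print(ci_profile)
print(ci_wald)
print(ci_manual)
```

The hand calculation should agree with the confint.default() row for wt, while the profile-likelihood interval will generally differ somewhat from both.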



casperyc wrote:
 
 Hi,
 
 I have a glm that gives the following summary:
 
              Estimate    Std. Error   z value   Pr(>|z|)
 (Intercept) -2.03693352  1.449574526 -1.405194  0.159963578
 A            0.01093048  0.006446256  1.695633  0.089955471
 N            0.41060119  0.224860819  1.826024  0.067846690
 S           -0.20651005  0.067698863 -3.050421  0.002285206
 
 Then I use confint(k.glm) to obtain a confidence interval for the
 estimates.
 
 confint(k.glm,level=0.97)
 Waiting for profiling to be done...
                    1.5 %      98.5 %
 (Intercept) -5.471345995  0.94716503
 A           -0.002340863  0.02631582
 N           -0.037028592  0.95590178
 S           -0.365570347 -0.06573675
 
 While reading the help for 'confint', I found something like confint.glm
 for general linear model.
 I loaded the MASS package by clicking on the menu (or otherwise, how should I
 load the package?).
 
 I still can't use the confint.glm command. What have I done wrong?
 
 
 How do I calculate this confidence interval for a glm estimate manually?
 
 For A, I use
 0.01093048 + c(-1,1) * 0.006446256 * qt(0.985,df=77)
 which gives a different interval from the one I got from
 confint(k.glm,level=0.97) above.
 
 In short, what's the right command to find the confidence interval for
 glm estimates?
 How do I verify it manually?
 
 Thanks.
 
 casper
 
 
 

-- 
View this message in context: 
http://n4.nabble.com/confint-for-glm-general-linear-model-tp954071p95.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Subset sum problem.

2009-12-09 Thread Geert Janssens
On Wednesday 9 December 2009, Hans W Borchers wrote:
 Geert Janssens janssens-geert at telenet.be writes:
  Hi,
 
  I'm quite new to the R-project. I was suggested to look into it because I
  am trying to solve the Subset sum problem, which basically is:
  Given a set of integers and an integer s, does any non-empty subset sum
  to s? (See http://en.wikipedia.org/wiki/Subset_sum_problem)
 
  I have been searching the web for quite some time now (which is how I
  eventually discovered that my problem is called subset sum), but I can't
  seem to find an easily applicable implementation. I did search the list
  archive, the R website and used the help.search and apropos function. I'm
  afraid nothing obvious showed up for me.
 
  Has anybody tackled this issue before in R ? If so, I would be very
  grateful if you could share your solution with me.

 Is it really true that you only want to see a Yes or No answer to this
 question whether a subset sums up to s --- without learning which numbers
 this subset is composed of (the pure SUBSET SUM problem)?
 Then the following procedure does that in a reasonable amount of time
 (returning 'TRUE' or 'FALSE' instead of Y-or-N):

Unfortunately no. I do need the numbers in the subset. But thank you for 
presenting this code.

Geert

 # Exact algorithm for the SUBSET SUM problem
 exactSubsetSum <- function(S, t) {
   S <- S[S <= t]
   if (sum(S) < t) return(FALSE)
   S <- sort(S, decreasing=TRUE)
   n <- length(S)
   L <- c(0)
   for (i in 1:n) {
     L <- unique(sort(c(L, L + S[i])))
     L <- L[L <= t]
     if (max(L) == t) return(TRUE)
   }
   return(FALSE)
 }

 # Example with a set of cardinality 64
 amount - 4748652
 products -
 c(30500,30500,30500,30500,42000,42000,42000,42000,
   42000,42000,42000,42000,42000,42000,71040,90900,
   76950,35100,71190,53730,456000,70740,70740,533600,
   83800,59500,27465,28000,28000,28000,28000,28000,
   26140,49600,77000,123289,27000,27000,27000,27000,
   27000,27000,80000,33000,33000,55000,77382,48048,
   51186,4,35000,21716,63051,15025,15025,15025,
   15025,80,1110000,59700,25908,829350,1198000,1031655)

 # Timing is not that bad
 system.time( sol <- exactSubsetSum(products, amount) )
 #  user  system elapsed
 # 0.516   0.096   0.673
 sol
 # [1] TRUE
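
The core step of the quoted procedure, extending a list of reachable sums by one element at a time, can be seen on a tiny set. This is only an illustrative sketch: the helper name reachable_sums and the example numbers are made up here, and dropping sums above the target assumes that all elements are non-negative.

```r
# Reachable subset sums of S that do not exceed t (illustrative helper)
reachable_sums <- function(S, t) {
  L <- 0                       # the empty subset sums to 0
  for (s in S) {
    L <- unique(c(L, L + s))   # each known sum, with and without s
    L <- L[L <= t]             # sums above the target can be dropped
  }
  sort(L)
}

sums <- reachable_sums(c(5, 3, 2), 8)
print(sums)
print(8 %in% sums)   # is the target itself reachable?
```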



-- 
Kobalt W.I.T.
Web  Information Technology
Brusselsesteenweg 152
1850 Grimbergen

Tel  : +32 479 339 655
Email: i...@kobaltwit.be



Re: [R] Plotting frequency curve over histogram

2009-12-09 Thread nivas

Hi Guarav,
Have a look at this site. Its author is one of the people who designed R.
http://www.stat.auckland.ac.nz/~ihaka/courses/120/lectures.html

http://www.stat.auckland.ac.nz/~ihaka/?Teaching

It might be helpful, though I am not sure.
Thanks.

-- 
View this message in context: 
http://n4.nabble.com/Plotting-frequency-curve-over-histogram-tp956565p956592.html
Sent from the R help mailing list archive at Nabble.com.



[R] simple data manipulation question

2009-12-09 Thread dolar

Hi there,

I have a data frame with a whole lot of variables.

Let's say one of my variables is gender.
How do I simply get an average of all the other variables by gender?
-- 
View this message in context: 
http://n4.nabble.com/simple-data-manipulation-question-tp956600p956600.html
Sent from the R help mailing list archive at Nabble.com.



[R] Need help to forecasting the data of the time series .

2009-12-09 Thread nivas

Hi, 
This is time series data collected monthly from 2001 to 2008, so there are
96 entries. I have done basic statistics. I need to find a model that fits
this data so that I can forecast it. It is the mixed paper collection for
recycling on the campus. 






13251 
13754 
19061 
12631 
17414 
21350 
25384 
23646 
20312 
20740 
14007 
17175 
13910 
17191 
17113 
20250 
35003 
11975 
19665 
20490 
20436 
22885 
17075 
18205 
15720 
25264 
16258 
33430 
31598 
19764 
21006 
29210 
35750 
27881 
25751 
27601 
27316 
20893 
27308 
27182 
28178 
28057 
35623 
51094 
36365 
29301 
18718 
22683 
53898 
40339 
28462 
31555 
32484 
40497 
28547 
40509 
31220 
48399 
38998 
44489 
41588 
47240 
57035 
54919 
50513 
42296 
39124 
36217 
43173 
56311 
50726 
49621 
52430 
56236 
59573 
66819 
66345 
44838 
45847 
51066 
49688 
52978 
45205 
51043 
48693 
65470 
45073 
55923 
58766 
41289 
50514 
45901 
51198 
63914 
57128 
50702 

Please advise me on how to analyze this data so that I can understand it well. 
Thanks, 
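
As a starting point, values like these can be stored as a monthly ts object before any modelling. The sketch below uses only the first twelve values from the list above and assumes, since the post does not say so explicitly, that the first value belongs to January 2001.

```r
# First twelve values from the post, taken here as January to December 2001
paper <- c(13251, 13754, 19061, 12631, 17414, 21350,
           25384, 23646, 20312, 20740, 14007, 17175)

# Monthly series starting in January 2001 (the start date is an assumption)
paper_ts <- ts(paper, start = c(2001, 1), frequency = 12)

print(paper_ts)
print(start(paper_ts))
print(end(paper_ts))
```

With all 96 values in the vector, the same call would give a series ending in December 2008.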
-- 
View this message in context: 
http://n4.nabble.com/Need-help-to-forecasting-the-data-of-the-time-series-tp956593p956593.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] What is the function to test if a vector is ordered or not?

2009-12-09 Thread Marc Schwartz

On Dec 9, 2009, at 10:10 PM, Peng Yu wrote:


I did a search on www.rseek.org to look for the function to test if a
vector is ordered or not. But I don't find it. Could somebody let me
know what function I should use?



If by ordered, you mean sorted, then ?is.unsorted

> is.unsorted(c(1, 4, 2, 6, 7))
[1] TRUE

> is.unsorted(sort(c(1, 4, 2, 6, 7)))
[1] FALSE


If you mean to test a factor to see if it is an ordered factor, then
?is.ordered


> is.ordered(factor(letters))
[1] FALSE

> is.ordered(factor(letters, ordered = TRUE))
[1] TRUE


HTH,

Marc Schwartz



Re: [R] Subset sum problem.

2009-12-09 Thread Hans W Borchers
Geert Janssens janssens-geert at telenet.be writes:
 
 On Wednesday 9 December 2009, Hans W Borchers wrote:
  Geert Janssens janssens-geert at telenet.be writes:
   [ ... ]
   Has anybody tackled this issue before in R ? If so, I would be very
   grateful if you could share your solution with me.
 
  Is it really true that you only want to see a Yes or No answer to this
  question whether a subset sums up to s --- without learning which numbers
  this subset is composed of (the pure SUBSET SUM problem)?
  Then the following procedure does that in a reasonable amount of time
  (returning 'TRUE' or 'FALSE' instead of Y-or-N):
 
 Unfortunately no. I do need the numbers in the subset. But thank you for 
 presenting this code.
 
 Geert
 

Okay then, here we go. But don't tell me later that your requirement was to
generate _all_ subsets that add up to a certain amount.  I will generate
only one (with largest elements).

For simplicity I assume that the set has been prepared such that it is sorted
in decreasing order, has no elements larger than the amount given, and has a
total sum at least as large as this amount.

# Assume S decreasing, no elements > t, total sum >= t
solveSubsetSum <- function(S, t) {
  L <- c(0)
  inds <- NULL
  for (i in 1:length(S)) {
    # L <- unique(sort(c(L, L + S[i])))
    L <- c(L, L+S[i])
    L <- L[L <= t]
    if (max(L) == t) {
      inds <- c(i)
      t <- t - S[i]
      while (t > 0) {
        K <- c(0)
        for (j in 1:(i-1)) {
          K <- c(K, K+S[j])
          if (t %in% K) break
        }
        inds <- c(inds, j)
        t <- t - S[j]
      }
      break
    }
  }
  return(inds)
}

# former example
amount - 4748652
products - 
c(30500,30500,30500,30500,42000,42000,42000,42000,
  42000,42000,42000,42000,42000,42000,71040,90900,
  76950,35100,71190,53730,456000,70740,70740,533600,
  83800,59500,27465,28000,28000,28000,28000,28000,
  26140,49600,77000,123289,27000,27000,27000,27000,
  27000,27000,80000,33000,33000,55000,77382,48048,
  51186,4,35000,21716,63051,15025,15025,15025,
  15025,80,1110000,59700,25908,829350,1198000,1031655)

# prepare set
prods <- products[products <= amount]  # no elements > amount
prods <- sort(prods, decreasing=TRUE)  # decreasing order

# now find one solution
system.time(is <- solveSubsetSum(prods, amount))
#  user  system elapsed 
# 0.320   0.032   0.359 

prods[is]
#  [1]   70740   70740   71190   76950   77382   80000   83800
#  [8]   90900  456000  533600  829350 1110000 1198000

sum(prods[is]) == amount
# [1] TRUE

Note that running times and memory needs will be much higher when more
summands are needed.  I should also mention that I have not tested the
code extensively.
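
The remark on memory can be made concrete with a small sketch. It compares the list of partial sums kept with duplicates, as in solveSubsetSum above, against the de-duplicated list, as in the commented-out line. The set S and the target t below are made-up illustration values.

```r
S <- c(8, 6, 4, 2, 2)   # hypothetical small set, decreasing order
t <- 12                 # hypothetical target

L_all  <- 0   # partial sums with duplicates kept
L_uniq <- 0   # partial sums with duplicates removed
for (s in S) {
  L_all  <- c(L_all, L_all + s)
  L_all  <- L_all[L_all <= t]
  L_uniq <- unique(c(L_uniq, L_uniq + s))
  L_uniq <- L_uniq[L_uniq <= t]
}

print(length(L_all))    # grows quickly when many sums repeat
print(length(L_uniq))   # never more than the number of distinct sums <= t
```

Both lists contain the same distinct values; only the amount of repeated work differs.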

Regards
Hans Werner



Re: [R] Assigning variables into an environment.

2009-12-09 Thread Gabor Grothendieck
R uses lexical scoping, not dynamic scoping.  It does not matter where
bar is called from. What matters is where bar was defined, and since
bar was defined in the global environment, that is where its free
variables are looked up; thus environment(bar) is the global
environment.  Try changing foo to the following, which creates a new
function, also called bar, that exists only for the duration of the call
to foo and whose free variables are looked up within foo.  We assign the
variables within foo and then call this newly created bar.

foo <- function(x,zeta) {
  environment(bar) <- environment()
  for(nm in names(zeta)) assign(nm,zeta[nm])
  bar(x)
}
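
A self-contained sketch of this approach, which also checks that nothing leaks into the global environment. The definitions of bar and v follow the original question; using zeta[[nm]] instead of zeta[nm] is a small change made here so that the result carries no name.

```r
bar <- function(x) alpha + beta * exp(gamma * x)

foo <- function(x, zeta) {
  # Make a local copy of bar whose enclosing environment is this call's frame
  environment(bar) <- environment()
  # assign() without 'envir' writes into the current (foo's) environment
  for (nm in names(zeta)) assign(nm, zeta[[nm]])
  bar(x)
}

v <- c(alpha = 2, beta = 3, gamma = -4)
res <- foo(0.1, v)
print(res)

# alpha should not have been created in the global environment
print(exists("alpha", envir = globalenv(), inherits = FALSE))
```

The global bar is left untouched: the replacement function `environment<-` acts on a copy that lives only inside the call to foo.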

If we regard bar as a method acting on an object whose variables are
alpha, beta and gamma then we can use the proto package like this:


library(proto)  # see http://r-proto.googlecode.com

p <- proto(alpha = 2, beta = 3, gamma = -4,
  bar = function(this, x) this$alpha + this$beta * exp(this$gamma * x))
p$bar(0.1)

# we could change any of the variables and run again
# e.g.  p$alpha <- 3; p$bar(0.1)

# or we could create a child of p, q.
# q inherits all of p's variables and methods
# however, let's explicitly override alpha
q <- p$proto(alpha = 12)
q$bar(0.1)






On Wed, Dec 9, 2009 at 10:43 PM, Rolf Turner r.tur...@auckland.ac.nz wrote:

 I am working with a somewhat complicated structure in which
 I need to deal with a function that takes ``basic'' arguments
 and also depends on a number of parameters which change depending
 on circumstances.

 I thought that a sexy way of dealing with this would be to assign
 the parameters as objects in the environment of the function in
 question.

 The following toy example gives a bit of the flavour of what I
 am trying to do:

 foo <- function(x,zeta) {
 for(nm in names(zeta)) assign(nm,zeta[nm],envir=environment(bar))
 bar(x)
 }

 bar <- function(x) {
 alpha + beta*exp(gamma*x)
 }

 v <- c(alpha=2,beta=3,gamma=-4)

 ls()
 [1] "bar" "foo" "v"

 foo(0.1,v)
   alpha
 4.01096

 2+3*exp(-4*0.1)
 [1] 4.01096      # Check; yes it's working; but ...

 ls()
 [1] "alpha" "bar"   "beta"  "foo"   "gamma" "v"

 The parameters got assigned in the global environment (as well as in
 the environment of bar()? Or instead of?).

 I didn't want that to happen.

 Questions:

 (a) What did I do wrong?
 (b) What am I not understanding about environments?
 (c) How can I get the parameters to be assigned in the environment of bar()
    and ***NOT*** in the global environment?
 (d) Is it time to go to the pub yet?

 [Please don't make suggestions about doing it all some other way, e.g.
 using the ``...'' argument facility.  I know there are other ways to
 attack my problem (but they may not be so efficacious in the context
 of my real --- as opposed to toy --- example).  I want to try to get
 the environment idea to work, and I want to understand more about
 environments and how they work.]

 Thanks for any insights.

        cheers,

                Rolf Turner

 P. S. Are there any articles about environments (in the sense used above)
 out there in the literature?  I searched the contents of the R Journal/R
 News
 but turned up nothing on environments.

                R. T.





Re: [R] What is the function to test if a vector is ordered or not?

2009-12-09 Thread jim holtman
Try

all(diff(order(yourVector)) == 1)
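
A short sketch of how this idiom behaves, next to the built-in is.unsorted() for comparison. The wrapper name is_sorted and the test vectors are made up for illustration; the idiom checks for non-decreasing order.

```r
# Wrapper around the idiom above (the function name is hypothetical)
is_sorted <- function(x) all(diff(order(x)) == 1)

x_sorted   <- c(1, 2, 2, 5)      # non-decreasing, with a tie
x_unsorted <- c(1, 4, 2, 6, 7)

print(is_sorted(x_sorted))
print(is_sorted(x_unsorted))

# The built-in test gives the complementary answer
print(!is.unsorted(x_sorted))
print(!is.unsorted(x_unsorted))
```

Ties are accepted because order() keeps tied elements in their original positions.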

On Wed, Dec 9, 2009 at 10:10 PM, Peng Yu pengyu...@gmail.com wrote:

 I did a search on www.rseek.org to look for the function to test if a
 vector is ordered or not. But I don't find it. Could somebody let me
 know what function I should use?





-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] Assigning variables into an environment.

2009-12-09 Thread hadley wickham
On Wed, Dec 9, 2009 at 9:43 PM, Rolf Turner r.tur...@auckland.ac.nz wrote:

 I am working with a somewhat complicated structure in which
 I need to deal with a function that takes ``basic'' arguments
 and also depends on a number of parameters which change depending
 on circumstances.

 I thought that a sexy way of dealing with this would be to assign
 the parameters as objects in the environment of the function in
 question.

 The following toy example gives a bit of the flavour of what I
 am trying to do:

 foo <- function(x,zeta) {
 for(nm in names(zeta)) assign(nm,zeta[nm],envir=environment(bar))
 bar(x)
 }

 bar <- function(x) {
 alpha + beta*exp(gamma*x)
 }

 v <- c(alpha=2,beta=3,gamma=-4)

 ls()
 [1] "bar" "foo" "v"

 foo(0.1,v)
   alpha
 4.01096

 2+3*exp(-4*0.1)
 [1] 4.01096      # Check; yes it's working; but ...

 ls()
 [1] "alpha" "bar"   "beta"  "foo"   "gamma" "v"

 The parameters got assigned in the global environment (as well as in
 the environment of bar()? Or instead of?).

 I didn't want that to happen.

 Questions:

 (a) What did I do wrong?

The environment of bar is the environment in which it exists - the
global environment.

 (b) What am I not understanding about environments?

See above.

 (c) How can I get the parameters to be assigned in the environment of bar()
    and ***NOT*** in the global environment?

Define bar inside foo and rely on the usual lexical scoping rules.
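
One way to read this advice, sketched under the assumption that the function using the parameters (bar) is created inside the function that receives them (foo), so that bar's free variables resolve in foo's frame. The use of list2env() to copy the parameters is a choice made for this sketch, not something taken from the thread.

```r
foo <- function(x, zeta) {
  # Copy the named parameters into this call's own environment
  list2env(as.list(zeta), envir = environment())
  # bar is created here, so its free variables are looked up in foo's frame
  bar <- function(x) alpha + beta * exp(gamma * x)
  bar(x)
}

v <- c(alpha = 2, beta = 3, gamma = -4)
res <- foo(0.1, v)
print(res)

# Nothing called alpha should exist in the global environment afterwards
print(exists("alpha", envir = globalenv(), inherits = FALSE))
```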

 (d) Is it time to go to the pub yet?

Yes.

Hadley


-- 
http://had.co.nz/



Re: [R] Counting Frequencies

2009-12-09 Thread Wensui Liu
> x <- runif(10, 0, 1)
> x2 <- x > 0.5
> x2
 [1]  TRUE  TRUE FALSE FALSE FALSE  TRUE  TRUE FALSE  TRUE FALSE
> table(x2)
x2
FALSE  TRUE
    5     5
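
The same idea extends to the two thresholds in the question: summing a logical vector counts its TRUE values. This is a sketch on simulated data; the vector difference below is generated for illustration and stands in for the poster's column.

```r
set.seed(1)                          # for a reproducible illustration
difference <- runif(600, -1, 1)      # simulated stand-in for the real column

n_low  <- sum(difference < -0.5)     # values below -0.5
n_high <- sum(difference > 0.5)      # values above 0.5
n_mid  <- sum(difference >= -0.5 & difference <= 0.5)

print(c(low = n_low, middle = n_mid, high = n_high))
```

For a column in a data frame, the vector would be something like mydata$difference, where mydata is a placeholder name.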


On Wed, Dec 9, 2009 at 6:36 PM, BIGBEEF martin.beze...@gmail.com wrote:


 Hi - I'm having difficulty with frequencies in R. I have a table with a
 variable (column) called difference with 600 observations (rows). I would like
 to know how many values are < -0.5 as well as how many are > 0.5. The rest
 are obviously in the middle.

 In SAS I could do this immediately but am unable to do it in R.

 Thanks for your help.
 --
 View this message in context:
 http://n4.nabble.com/Counting-Frequencies-tp956556p956556.html
 Sent from the R help mailing list archive at Nabble.com.





-- 
==
WenSui Liu
Blog   : statcompute.spaces.live.com
Tough Times Never Last. But Tough People Do.  - Robert Schuller
==



Re: [R] Population Histogram

2009-12-09 Thread Jim Lemon

On 12/10/2009 05:47 AM, terry johnson wrote:

How would I make a population histogram in R from an excel file? Thanks
   

Hi Terry,

library(gdata)
library(plotrix)
ozpop <- read.xls("ozpop.xls")
xycol <- color.gradient(c(0,0,0.5,1),c(0,0,0.5,1),c(1,1,0.5,1),18)
xxcol <- color.gradient(c(1,1,0.5,1),c(0.5,0.5,0.5,1),c(0.5,0.5,0.5,1),18)
par(mar=pyramid.plot(ozpop$Male,ozpop$Female,labels=ozpop$Age,
 main="Australian population pyramid 2002",xycol=xycol,xxcol=xxcol))

Jim



Re: [R] Counting Frequencies

2009-12-09 Thread Jim Lemon

On 12/10/2009 10:36 AM, BIGBEEF wrote:

Hi - I'm having difficulty with frequencies in R. I have a table with a
variable (column) called difference with 600 observations (rows). I would like
to know how many values are < -0.5 as well as how many are > 0.5. The rest
are obviously in the middle.

In SAS I could do this immediately but am unable to do it in R.
   

Hi Martin,

difference <- runif(100)*2-1
table(cut(difference,
 breaks=c(min(difference)-0.1,-0.5,0.5,max(difference))))

The -0.1 is to ensure that the lowest value is included in the cut.
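
A variant of the same approach, sketched with infinite outer breaks so that no offset is needed, and with labels for readability. The label texts and the simulated data are made up for illustration. Note that cut() uses right-closed intervals by default, so a value of exactly -0.5 lands in the first group.

```r
set.seed(42)                          # for a reproducible illustration
difference <- runif(600) * 2 - 1      # simulated stand-in for the real column

counts <- table(cut(difference,
                    breaks = c(-Inf, -0.5, 0.5, Inf),
                    labels = c("below -0.5", "-0.5 to 0.5", "above 0.5")))
print(counts)
```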

Jim



Re: [R] Population Histogram

2009-12-09 Thread Jim Lemon

On 12/10/2009 04:39 PM, Jim Lemon wrote:

On 12/10/2009 05:47 AM, terry johnson wrote:

How would I make a population histogram in R from an excel file? Thanks

Hi Terry,

library(gdata)
library(plotrix)
ozpop <- read.xls("ozpop.xls")
xycol <- color.gradient(c(0,0,0.5,1),c(0,0,0.5,1),c(1,1,0.5,1),18)
xxcol <- color.gradient(c(1,1,0.5,1),c(0.5,0.5,0.5,1),c(0.5,0.5,0.5,1),18)
par(mar=pyramid.plot(ozpop$Male,ozpop$Female,labels=ozpop$Age,
 main="Australian population pyramid 2002",xycol=xycol,xxcol=xxcol))

Jim
Oops, sorry, XLS files must be verboten, as the example data I sent 
seems to have disappeared.


Jim
