Re: [R] Date conversion problem using as.Date

2005-03-18 Thread Gabor Grothendieck
Vegard Andersen vegard.andersen at ism.uit.no writes:

: 
: Hello!
: 
: My problem is that the Julian date behind my dates seems to be wrong. I  
: will exemplify my problem.
: 
: t1 <- "1998-11-20"
: t2 <- as.Date(t1)
: # Here t2 is correctly 1998-11-20, but
: date.mdy(t2)
: $month
: [1] 11
: $day
: [1] 19
: $year
: [1] 1988
: 
: And indeed, if I write: fix(t2) then I get : structure(10550, class =  
: "Date"). So the Julian date is 10550, which is 1988-11-19, not the  
: correct 1998-11-20
: 
: If I instead of as.Date use as.date, then things work ok. But I have  
: not found out how to instruct as.date to handle dates from the 21st  
: century.
: 
: I hope that someone can help me, thanks in advance!
: 


As already mentioned the date class in survival uses 1960 as
its origin:

   R> as.date(0)
   [1] 1Jan60

whereas the Date class uses 1970:

   R> structure(0, class = "Date")
   [1] 1970-01-01

Regarding your other question you can use a 4 digit year:

   R> as.date("2Jan2001")
   [1] 2Jan2001

or:

   R> as.date.Date <- function(x) as.date(format(x), order = "ymd")
   R> as.date.Date(t2)
   [1] 20Nov98
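
To see the two origins at work without loading any package, here is a
minimal sketch in base R only.  The variable names are made up for
illustration.  It reproduces the 1988-11-19 value from the question by
counting the same day number from 1960 instead of 1970, and then
corrects it by adding the number of days between the two origins:

```r
t2 <- as.Date("1998-11-20")
days <- as.numeric(t2)                           # days since 1970-01-01

# reading the same count against a 1960 origin shifts the date back
misread <- as.Date(days, origin = "1960-01-01")

# number of days between the two origins
offset <- as.numeric(as.Date("1970-01-01") - as.Date("1960-01-01"))
fixed <- as.Date(days + offset, origin = "1960-01-01")

print(days)
print(misread)
print(fixed)
```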

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Is a .R script file name available inside the script?

2005-03-18 Thread Gabor Grothendieck
Darren Weber darrenleeweber at gmail.com writes:

: 
: Hi,
: 
: if we have a file called Rscript.R that contains the following, for example:
: 
: x <- 1:100
: outfile = "Rscript.Rout"
: sink(outfile)
: print(x)
: 
: and then we run
: 
:  source("Rscript.R")
: 
: we get an output file called Rscript.Rout - great!
: 
: Is there an internal variable, something like .Platform, that holds
: the script name when it is being executed?  I would like to use that
: variable to define the output file name.
: 


In R 2.0.1 try putting this in a file and sourcing it.

script.description <- function() eval.parent(quote(file), n = 3)
print(basename(script.description()))


If you are using R 2.1.0 (devel) then use this instead:

script.description <- function() 
showConnections() [as.character(eval.parent(quote(file), n = 3)), 
"description"]
print(basename(script.description()))



Re: [R] newbie question about beta distribution

2005-03-20 Thread Gabor Grothendieck
 faisal99 at inf.its-sby.edu writes:

: 
: hi everyone,
: I'm still a newbie in statistics,
: 
: I have a question about beta distribution, that is,
: 
: In the references/tutorials I've found on the net, why does the beta
: distribution always have a value p(x) of more than 1?


Consider the uniform distribution on the interval (0, 1/a) whose 
probability density graph is a horizontal line at a.  If a > 1 then 
the probability density is greater than 1 for every point of its support,
showing that the density can indeed exceed 1.

: As far as I know, a probability density function never has a value of
: more than 1?
: 
: Is there anyone who can explain this to me? I'm not a statistics person,
: but I need to write code that uses some of these distribution functions.
:
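
A quick numerical check of the point above, as a sketch using only the
base R density functions dunif and dbeta.  The shape parameters 5 and 5
are arbitrary choices for illustration.  The density exceeds 1 at some
points, yet the area under it is still 1, which is the actual
constraint on a density:

```r
a <- 2
u <- dunif(0.25, min = 0, max = 1/a)   # uniform on (0, 1/2): density equals a

peak <- dbeta(0.5, shape1 = 5, shape2 = 5)   # beta density at its mode
area <- integrate(dbeta, 0, 1, shape1 = 5, shape2 = 5)$value

print(u)
print(peak)
print(area)
```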



Re: [R] Violin plot for discrete variables.

2005-03-21 Thread Gabor Grothendieck
Witold Eryk Wolski W.E.Wolski at ncl.ac.uk writes:

: 
: Dear Rgurus,
: 
: To my knowledge the best way to visualize the distribution of a discrete 
: variable X is
: plot(table(X))
: 
: The problem which I have is the following. I have two discrete variables 
: X and Y whose distributions I would like to compare. Overlaying the 
: distribution of Y with lines(table(Y)) does not give satisfying results. 
: This is the same in case of using density or histogram.
: 
: Hence, I am wondering if there is an equivalent of the vioplot function 
: (package vioplot) for discrete variables
: which starts with a boxplot and then adds a rotated plot(table()) plot 
: to each side of the box plot.
: 
: Maybe I should ask it first: Does such a plot make any sense? If not 
: are there better solutions?


You could try a barplot or a balloonplot:

tab <- table(stack(list(x1 = x1, x2 = x2))) # x1, x2 from Andy's post
barplot(t(tab), beside = TRUE)

library(gplots)
balloonplot(tab)


Although intended for comparing data to a theoretical distribution,
rootogram can compare two discrete distributions:

library(vcd)
rootogram(tab[,1], tab[,2])

Another possibility is to fit each distribution to a parametric form
using vcd::distplot as shown in the examples on its help page.
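
Since x1 and x2 come from another post, here is a self-contained sketch
of the barplot variant.  The two Poisson samples are made-up stand-ins
for the real data, so only the shape of the code should be taken from
it:

```r
set.seed(1)
x1 <- rpois(200, lambda = 3)   # hypothetical discrete sample 1
x2 <- rpois(200, lambda = 5)   # hypothetical discrete sample 2

# rows are the observed values, columns are the two samples
tab <- table(stack(list(x1 = x1, x2 = x2)))

# one group of bars per value, one bar per sample within each group
barplot(t(tab), beside = TRUE, legend.text = TRUE)
```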



Re: [R] Read a dataset with different lengths

2005-03-21 Thread Gabor Grothendieck
Xiyan Lon xiyanlon at gmail.com writes:

: 
: Dear useR again,
: How can I read a dataset if lines in dataset did not have same
: elements (have different lengths), For example:
: 
: 12,  4, 16,  1,  1,  3,  1,  1, 15,  5,  1,  1, 14,  1,  1
: 22, 13,  5,  1,  1,  3,  1,  1, 15,  5,  1,  1, 14,  1,  1
: 34,  5, 11,  1,  1,  6,  1,  1,  5, 14,  1,  1, 15,  1,  1
: 42,  5,  9,  1,  1, 14,  1,  1,  8, 16,  1,  1, 13,  1,  1
: 53,  7, 14,  1,  1, 14,  1,  1,  5, 21,  1,  1,  8,  1,  1
: 66,  3,  1, 12,  1,  1,  5,  8,  1,  1, 15,  1,  1
: 76,  3,  1, 11,  1,  1, 10,  7,  1,  1, 21,  1,  1
: 8   21, 20,  9,  1,  1,  6,  1,  1, 13, 10,  1,  1,  1
: 95,  7, 21,  1,  1, 13,  1,  1, 14,  2,  1,  1,  6,  1,  1
: 10   8, 14, 10,  1,  1,  5,  1,  1, 10,  5,  1,  1,  5,  1,  1
: 11   5, 20, 17,  1,  1, 19,  1,  1, 14,  7,  1,  1,  6,  1,  1
: 12   7,  4, 11,  1,  1,  2,  1,  1,  5, 13,  1,  1, 14,  1,  1
: 13   7, 14, 13,  1,  1,  6,  1,  1, 13, 16,  1,  1, 17,  1,  1
: 14   7, 14,  5,  1,  1,  5,  1,  1,  5, 17,  1,  1, 17,  1,  1
: 15   3,  9, 12,  1,  1, 18,  1,  1,  6,  1,  4,  1,  1
: 16   7, 10,  5,  1,  1, 12,  1,  1,  5, 17,  1,  1, 13,  1,  1
: 17  12,  8, 16,  1,  1,  5,  1,  1,  8, 10,  1,  1, 14,  1,  1
: 18   5, 11,  7,  1,  1,  5,  1,  1, 18, 13,  1,  1, 17,  1,  1
: 19   7, 13,  8,  1,  1, 14,  1,  1,  5, 17,  1,  1, 13,  1,  1
: 20   7, 18, 21,  1,  1, 16,  1,  1,  5, 17,  1,  1, 13,  1,  1
: 
: I know that in BioC package rmutil have a function (read.list) to
: handle different lengths sets of lines but it did not work.
: > library(rmutil)
: Error in library(rmutil) : 'rmutil' is not a valid package -- installed < 2.0.0?
: > 

rmutil can be found here:
 http://popgen.unimaas.nl/~jlindsey/rcode.html

: 
: Are there any others function to handle this.



nf <- count.fields(myfile, sep = ",")
z <- read.table(myfile, sep = ",", fill = TRUE,
                col.names = paste("V", seq_len(max(nf)), sep = ""))

If the longest line is among the first five lines you can omit the
col.names argument and the nf computation.

The above returns a data frame with one line per row and NAs at the end
to fill it out as necessary.  If you need a list of rows without the
NAs:

lapply(as.data.frame(t(data.matrix(z))), na.omit)
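
Here is a small self-contained sketch of the same idea.  The three
ragged lines and the temporary file are invented for illustration.  It
also shows a simpler alternative that skips the data frame altogether
when all you want is one numeric vector per line:

```r
# write a small ragged file to a temporary location
myfile <- tempfile(fileext = ".csv")
writeLines(c("2,4,16,1,1", "6,3,1", "7,14,13,1"), myfile)

nf <- count.fields(myfile, sep = ",")   # number of fields on each line
z <- read.table(myfile, sep = ",", fill = TRUE,
                col.names = paste0("V", seq_len(max(nf))))

# list of rows with the padding NAs removed
rows <- lapply(as.data.frame(t(data.matrix(z))), na.omit)

# alternative: split each line and convert, no padding involved
rows2 <- lapply(strsplit(readLines(myfile), ","), as.numeric)

print(z)
print(rows2)
```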



Re: [R] Convex hull line coordinates..

2005-03-21 Thread Gabor Grothendieck
 achilleas.psomas at wsl.ch writes:

: 
: Hello R-Helpers..
: 
: I am still new in R and I have the following question..
: I am applying the function chull on a 2D dataset and have the convex hull
: nicely
: calculated and plotted.
: Do you know if there is a way to extract the coordinates of the line created
: from the connection of the chull data points..
: I have alredy tried with approx to lineary interpolate but its not working
: correctly since the interpolated values sometimes fall inside the convex .
: Using the yleft or yright doesnt seem to help..
: 
: Any suggestions?

1. First suggestion is not to post by following up on an unrelated thread
since some people won't see it.  E.g. try finding it on gmane.  It's there
but good luck finding it.

2. Second suggestion is an example which creates a matrix z whose 
columns are the regression coefficients of the successive line 
segments.  Note use of lm's subset= arg to simplify code:

example(chull)  # creates hpts and X and plots convex hull
z <- sapply(2:length(hpts), function(i)
coef(lm(X[,2] ~ X[,1], subset = hpts[i-1:0])) ) 

# we can use z to display _full_ lines, on top of the line
# _segments_ that were displayed in example(chull):
for(i in 1:ncol(z)) abline(coef = z[,i], col = "red", lty = 2)
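
If what is wanted is points along the hull outline itself, one
possibility is to interpolate each edge parametrically between its two
end points.  approx treats y as a function of x, which a closed outline
is not, so that may be why it gave points inside the hull.  The sketch
below uses random data, and edge_points is a hypothetical helper, not
part of any package:

```r
set.seed(42)
X <- matrix(rnorm(200), ncol = 2)

hpts <- chull(X)
hpts <- c(hpts, hpts[1])      # repeat the first vertex to close the polygon
vertices <- X[hpts, ]         # coordinates of the hull vertices, in order

# n points from p to q, both end points included
edge_points <- function(p, q, n = 10) {
  s <- seq(0, 1, length.out = n)
  cbind(x = p[1] + s * (q[1] - p[1]),
        y = p[2] + s * (q[2] - p[2]))
}

# walk along consecutive vertex pairs and stack the edge points
outline <- do.call(rbind, lapply(seq_len(nrow(vertices) - 1), function(i)
  edge_points(vertices[i, ], vertices[i + 1, ])))

plot(X)
lines(vertices)
points(outline, col = "red", pch = 20)
```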



Re: [R] flatten a matrix and unflatten it

2005-03-21 Thread Gabor Grothendieck
Bill Simpson William.Simpson at drdc-rddc.gc.ca writes:

: 
: I want to flatten a matrix and unflatten it again. Please tell me how to 
: do it.
: 
: 1. given a matrix:
: x1 y1 z1
: x2 y2 z2
: ...
: xk yk zk
: convert it to a vector:
: x1, y1, z1, x2, y2, z2, ..., xk, yk, zk
: 
: 2. given a vector:
: x1, y1, z1, x2, y2, z2, ..., xk, yk, zk
: convert it to a matrix
: x1 y1 z1
: x2 y2 z2
: ...
: xk yk zk
: 
: It is known that the number of dimensions is 3.
: 

myvector <- c(t(mymatrix))  
mymatrix <- matrix(myvector, byrow = TRUE, ncol = 3)  

If column-wise is ok rather than row-wise as you show, then
omit t() in the first line and byrow = TRUE in the second.
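
A small round trip, as a sketch with a made-up 2 x 3 matrix, to confirm
that the two lines undo each other:

```r
mymatrix <- matrix(1:6, ncol = 3, byrow = TRUE)   # rows are (1,2,3) and (4,5,6)

myvector <- c(t(mymatrix))                        # row-wise flattening
restored <- matrix(myvector, byrow = TRUE, ncol = 3)

print(myvector)
print(identical(restored, mymatrix))
```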



Re: [R] problem in textConnection function

2005-03-21 Thread Gabor Grothendieck
Michael S michael_shen at hotmail.com writes:

: 
: Dear all-helpers:
: 
: I create one package ,code like this:
: output <-
: function(x,y)
: {
:   zz <- textConnection("foo", "w")
:   sink(zz)
:   a <- 5
:   b <- 6
:   z <- a*b
:   z
:   e <- "spss"
:   h <- c(1,2,3)
:   ls()
:   r <- c("s","p","s","s")
:   p <- list(1:10)
:   p
:   sink()
:   close(zz)
:   x <- foo
:   y <- foo
: # .C("output",as.character(x),as.character(y))
: }
: 
: Making the package is OK, but when I use output in Rgui, neither object x 
: nor y gets the result I expect (the textConnection result). When I copy the 
: code and paste it into Rgui, it is OK. What should I do?
: 

This is a FAQ:

http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-is-the-output-not-printed-when-I-source_0028_0029-a-file_003f
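
The point of that FAQ entry, in a minimal sketch: a bare expression is
only auto-printed at top level, so inside a function you need an
explicit print().  The two function names are invented for the
illustration:

```r
quiet <- function() {
  z <- 5 * 6
  z                  # evaluated but not printed inside a function
  invisible(NULL)
}

loud <- function() {
  z <- 5 * 6
  print(z)           # explicit print produces output
  invisible(NULL)
}

out_quiet <- capture.output(quiet())
out_loud <- capture.output(loud())

print(out_quiet)
print(out_loud)
```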



Re: [R] I modify my question in textconnection output

2005-03-21 Thread Gabor Grothendieck
Michael S michael_shen at hotmail.com writes:

: 
: dear ALL-R-helper:
: I modify my question in textconnection output:
: I wrote one function in Rgui:
: output <- function(y){
:   x <- textConnection("foo", "w")
:   sink(x)
:   a <- 5
:   b <- 6
:   z <- a*b
:   z
:   e <- "spss"
:   h <- c(1,2,3)
:   ls()
:   r <- c("s","p","s","s")
:   p <- list(1:10)
:   p
:   y <- foo
:   sink()
:   close(x)
:   return(y)
: }
: 
: The result I want to get is:
: y
: 
: [1] [1] 30
: [2]  [1] \a\  \b\  \c\  \d\  \e\  \f\  
: \foo\\g\  \g.p\\h\  \interp\ \m\  
: \mytest\
: [3] [14] \output\ \p\  \r\  \var1\   \var2\   \x\  
: \y\  \z\ 
: [4] [[1]]
: [5]  [1]  1  2  3  4  5  6  7  8  9 10
: [6] 
: 
: When I copy the command lines within the function and paste them into RGui,
: the result is OK. But when I use the output function and show the value of
: the y object, I get the result character(0).
: 
: It seems to me that I didn't get the value of y within the function.

You have not defined foo within your function.  If you have
a foo outside your function then that is being assigned to
y.  If you haven't a foo anywhere then you should have received 
an error.

You might want to look at ?capture.output  

y <- capture.output({
  x <- 1
  print(x)
})
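
Applied to a cut-down version of the function in the question, that
could look like the sketch below.  Note the explicit print() calls,
which are needed because nothing is auto-printed inside a function:

```r
output <- function() {
  capture.output({
    z <- 5 * 6
    print(z)
    p <- list(1:10)
    print(p)
  })
}

y <- output()   # character vector, one element per printed line
print(y)
```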



Re: [R] Newbie: Matrix indexing

2005-03-22 Thread Gabor Grothendieck
Pascal BLEUYARD p.bleuyard at opgc.univ-bpclermont.fr writes:

: 
: Hi all,
: 
:   I need to compute some occurence matrix: given a zero matrix and a set
: of paired indexes, I want to store the number of occurences of each paired
: index in a matrix. The paired indexes are stores as an index matrix. I
: prefere not to use loops for performances purpose.
: 
:   Here follows a dummy example:
: 
: > occurence <- matrix(0, 2, 2); data
:      [,1] [,2]
: [1,]    0    0
: [2,]    0    0
: 
: > index <- matrix(1, 3, 2); index
:      [,1] [,2]
: [1,]    1    1
: [2,]    1    1
: [3,]    1    1
: 
: > occurence[index] <- occurence[index] + 1
: 
:   I was expecting the following result:
: 
: > occurence
:      [,1] [,2]
: [1,]    3    0
: [2,]    0    0
: 
:   I get instead:
: 
: > occurence
:      [,1] [,2]
: [1,]    1    0
: [2,]    0    0
: 
:   I guess that there is some hidden copy involved but I wanted to know if
: there is an efficient workaround (not using some loop structure). I thought
: factors could do the job but I didn't manage to use them for that problem.

Turn your index matrix into a data frame so you can use lapply on it.
Then convert each of the two columns to a two-level factor.  Now you 
can use table on the result:

   table(lapply(as.data.frame(index), factor, lev = 1:2))
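
For the record, the indexed assignment in the question gives 1 because
the right-hand side is computed once (three copies of 0 + 1) and the
same cell is then simply overwritten three times.  A self-contained
sketch of the table approach, with one extra made-up index pair (2, 1)
added so the result is less trivial:

```r
index <- rbind(c(1, 1), c(1, 1), c(1, 1), c(2, 1))

# one factor per index column, each with the full set of levels
occurrence <- table(lapply(as.data.frame(index), factor, levels = 1:2))

print(occurrence)
```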



Re: [R] r under linux: creating high quality bmp's for win users

2005-03-22 Thread Gabor Grothendieck

Can you provide a link?  I did a Google search and found something
on a Japanese site but it turned out that the writer had made a 
mistake and it linked to wmf2eps, not eps2wmf.

Christophe Pallier pallier at lscp.ehess.fr writes:

: 
: Hello Christoph!
: 
: In the past, I used an utility called eps2wmf.
: It only works under Windows though (maybe under Linux with wine?).
: I believe it is available on the CTAN (Tex archives).
: 
: The nice thing is that wmf files are not bitmap and scale well.
: 
: Christophe Pallier
: 
: Christoph Lehmann wrote:
: 
:  Hi
: 
:  I produce graphics with R under linux, but my collaborators often use 
:  windows and cannot import eps pics e.g. in msword
: 
:  what is the standard way to get e.g. bmp's with the same quality as 
:  eps.  going the way: creating eps, convert eps2bmp using 'convert' 
:  doesn't yield good enough bmp's
: 
:  thanks for a short hint
: 
:  cheers
:  christoph
: 
:



Re: [R] extracting numerical data from text field

2005-03-23 Thread Gabor Grothendieck
Luis Tercero luis.tercero at ebi-wasser.uni-karlsruhe.de writes:

: 
: I have imported a data frame that looks like this:
: 
:           Measurement.Date.and.Time Z.Average..nm.   PDI
: 572 Dienstag, 22. März 2005 11:05:59          366,4 0,468
: 573 Dienstag, 22. März 2005 11:09:30          353,4 0,532
: 574 Dienstag, 22. März 2005 11:12:59            343 0,428
: 575 Dienstag, 22. März 2005 11:16:28          354,1 0,433
: 576 Dienstag, 22. März 2005 11:19:59          341,9 0,349
: 577 Dienstag, 22. März 2005 11:23:29          334,9 0,429
: ...
: 
: Would there be a way to extract the time in numerical form from the
: Measurement.Date.and.Time field?  What I would like to do is a time
: series where, for example,
: Dienstag, 22. März 2005 11:05:59 is time=0 min
: Dienstag, 22. März 2005 11:09:30 is time=3.5 min, etc.
: 
: Thank you in advance for your help.
: 
: Luis

Make sure that you are in a German locale:

  # this works on Windows XP.  On other OS, the "ge" code may differ.
  Sys.setlocale("LC_TIME", "ge") 

Then if DF is your data frame use strptime (see ?strptime for more
on the % codes):

  dat <- strptime(DF[,1], "%A, %d. %B %Y %H:%M:%S")
  dat - dat[1]   # difference in time since the first date time
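
If setting a German locale is not possible on your system, a
locale-independent variant is to translate the month name yourself.
This is only a sketch: it assumes every entry has exactly the
"weekday, day. month year time" layout shown in the question, and the
months_de lookup vector and the other names are made up for the
example.  The \u00e4 escape is just a portable way of writing the a
umlaut in März:

```r
stamps <- c("Dienstag, 22. M\u00e4rz 2005 11:05:59",
            "Dienstag, 22. M\u00e4rz 2005 11:09:30")

months_de <- c("Januar", "Februar", "M\u00e4rz", "April", "Mai", "Juni",
               "Juli", "August", "September", "Oktober", "November",
               "Dezember")

# columns: weekday, day (with trailing dot), month name, year, time
parts <- do.call(rbind, strsplit(stamps, " ", fixed = TRUE))

iso <- sprintf("%s-%02d-%02d %s",
               parts[, 4],
               match(parts[, 3], months_de),
               as.integer(sub(".", "", parts[, 2], fixed = TRUE)),
               parts[, 5])

dat <- as.POSIXct(iso, tz = "UTC", format = "%Y-%m-%d %H:%M:%S")

# elapsed minutes since the first measurement
minutes <- as.numeric(difftime(dat, dat[1], units = "mins"))
print(minutes)
```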



Re: [R] extracting numerical data from text field

2005-03-23 Thread Gabor Grothendieck

One other comment.

I assumed your date-time field is stored as character in the data
frame.  If it's stored as a factor then you need to convert it to
character first using as.character.  If it's already stored as a 
POSIXct date-time then all you have to do is subtract off the
first one.  (Note that if you put the output of dput(DF) in your
post then people will be able to exactly recreate your data frame
and then know what you have.)

Also, R News 4/1 has a table with lots of date-time processing
idioms.



Re: [R] Fold in R?

2005-03-27 Thread Gabor Grothendieck
Seung Jun jun at cc.gatech.edu writes:

: 
: Fold in Mathematica (or reduce in Python) works as follows:
: 
: Fold[f, x, {a, b, c}] := f[f[f[x,a],b],c]
: 
: That is, f is a binary operator, x is the initial value, and the results 
: are cascaded along the list.  I've found it useful for reducing lists 
: when I only have a function that accepts two arguments (e.g., merge in R).
: 
: Is there any R equivalent?  I'm a newbie in R and having a hard time 
: finding such one.  Thank you.
: 

You could define it yourself like this:

   Fold <- function(f, x, L) { for(e in L) x <- f(x, e); x }

   # example of its use
   result <- Fold(sum, 0, 1:3)  # result is 6


Note that merge.zoo in the zoo package does handle multiple
arguments; however, that is intended for merging time series
along their times, in case that is your application.
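
Current versions of base R also provide Reduce, which does the same
kind of left fold.  The sketch below compares it with a hand-written
Fold and uses it with merge as in the question.  The three small data
frames and their id column are invented for the example:

```r
Fold <- function(f, x, L) { for (e in L) x <- f(x, e); x }

r1 <- Fold(`+`, 0, 1:3)
r2 <- Reduce(`+`, 1:3, 0)     # third argument is the initial value

# folding a two-argument merge over a list of data frames
dfs <- list(data.frame(id = 1:3, a = 1:3),
            data.frame(id = 1:3, b = 4:6),
            data.frame(id = 1:3, c = 7:9))
merged <- Reduce(function(x, y) merge(x, y, by = "id"), dfs)

print(r1)
print(r2)
print(merged)
```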



Re: [R] Generating list of vector coordinates

2005-03-28 Thread Gabor Grothendieck
If the odometer order in your post is essential then 
you  could try this:

  expand.grid(1:5, 1:4, 1:3)[,3:1]

If R's reverse odometer order is ok then you could
simplify it to this:

  expand.grid(1:3, 1:4, 1:5)
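
A sketch of the first form with named columns, plus one way to turn
the rows into a list of coordinate vectors as in the Mathematica
output.  The column names i, j, k are chosen here only to mirror the
question:

```r
# the first argument varies fastest, so list k first and then reverse columns
g <- expand.grid(k = 1:5, j = 1:4, i = 1:3)[, 3:1]

# one unnamed vector per row
coords <- lapply(seq_len(nrow(g)), function(r) unlist(g[r, ], use.names = FALSE))

print(head(g))
print(coords[[60]])
```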

On Mon, 28 Mar 2005 15:20:46 -0800, Ronnen Levinson [EMAIL PROTECTED] wrote:
 
   Hi.
   Can  anyone  suggest  a  simple  way  to  obtain in R a list of vector
   coordinates of the following form? The code below is Mathematica.
 
 In[5]:=
 Flatten[Table[{i,j,k},{i,3},{j,4},{k,5}], 2]
 Out[5]=
 {{1,1,1},{1,1,2},{1,1,3},{1,1,4},{1,1,5},
  {1,2,1},{1,2,2},{1,2,3},{1,2,4},{1,2,5},
  {1,3,1},{1,3,2},{1,3,3},{1,3,4},{1,3,5},
  {1,4,1},{1,4,2},{1,4,3},{1,4,4},{1,4,5},
  {2,1,1},{2,1,2},{2,1,3},{2,1,4},{2,1,5},
  {2,2,1},{2,2,2},{2,2,3},{2,2,4},{2,2,5},
  {2,3,1},{2,3,2},{2,3,3},{2,3,4},{2,3,5},
  {2,4,1},{2,4,2},{2,4,3},{2,4,4},{2,4,5},
  {3,1,1},{3,1,2},{3,1,3},{3,1,4},{3,1,5},
  {3,2,1},{3,2,2},{3,2,3},{3,2,4},{3,2,5},
  {3,3,1},{3,3,2},{3,3,3},{3,3,4},{3,3,5},
  {3,4,1},{3,4,2},{3,4,3},{3,4,4},{3,4,5}}
 
   I've  been  futzing with apply(), outer(), and so on but haven't found
   an elegant solution.
   Thanks,
   Ronnen.
   P.S. E-mailed CCs of posted replies appreciated.


Re: [R] Annotation metadata kills help.search

2005-03-28 Thread Gabor Grothendieck
This happened to me in R 2.1.0 (I forget which specific version
since I now have March 27th) on Windows XP which I traced to
package dataRep.  Once I removed that package help.search 
worked again.


On Mon, 28 Mar 2005 22:08:58 -0500, Gerard Tromp
[EMAIL PROTECTED] wrote:
 Greetings!
 
 OS: Windows
 R 2.0.1
 
 Before anyone flames -- I tried to query this on the R searchable web site
 and using google and did not find anything applicable.
 
 As of about a week ago the help.search function dies when used in the simple
 help.search(something) usage.
 The error is
 Error in rbind(...) : number of columns of matrices must match (see arg 203)
 
 After some effort I have traced it down to the annotation packages. I
 installed
 GO, KEGG, mgu74[abc]v2 and hgu133plus2 all version 1.7.0
 
 When I move these out of the library directory, help.search() functions
 correctly again.
 
 I have not tracked it any further -- just wanted to know if anyone else had
 noticed it.
 
 Gerard Tromp
 


Re: [R] Recall() and sapply()

2005-03-30 Thread Gabor Grothendieck
I believe that the function that Recall executes is the
function in which Recall, itself, is evaluated -- not the
function in which Recall appears.  In normal cases these are
the same but if you pass Recall to another function then
they are not the same.  Here Recall is being passed to
sapply (which in turn is likely passing it to other
functions).  Because of lazy evaluation Recall does not get
evaluated until it is found within sapply (or a function
called by it or called by one called by it, etc.) and at
that point it's recalling the wrong function.  AFAICS one
cannot pass Recall to another function.

You could rewrite the expression that uses sapply to use 
iteration instead or you could do it as shown below.  
In this example, the use of f2 within sapply refers to 
the inner f2 which does not change even if the name of 
the outer f2 does.

   f2 <- function(n) {
      f2 <- function(n) if(length(n)>1) sapply(n,f2) else matrix(n,n,n)
      f2(n)
   }
   f3 <- f2
   f2(1:3)
   f3(1:3)  # gives same result
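
To check that the inner definition really is independent of the outer
name, the sketch below copies the function to f3, removes f2 from the
workspace and calls f3.  It should still work, because the recursive
reference is resolved inside the function's own environment:

```r
f2 <- function(n) {
  # inner f2 is found before any global f2, so renaming the outer one is safe
  f2 <- function(n) if (length(n) > 1) sapply(n, f2) else matrix(n, n, n)
  f2(n)
}

f3 <- f2
rm(f2)            # the outer name is gone

res <- f3(1:3)    # list whose i-th element is an i-by-i matrix filled with i
print(res)
```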



On Wed, 30 Mar 2005 09:28:08 +0100, Robin Hankin
[EMAIL PROTECTED] wrote:
 Hi.
 
 I'm having difficulty following the advice given in help(Recall).
 Consider the two
 following toy functions:
 
 f1 <- function(n){
   if(length(n)>1){return(sapply(n,f1))}
   matrix(n,n,n)
 }
 
 f2 <- function(n){
   if(length(n)>1){return(sapply(n,Recall))}
   matrix(n,n,n)
 }
 
 f1() works as desired (that is, f(1:3), say, gives me a three element
 list whose i-th element
 is an i-by-i matrix whose elements are all i).
 
 But f2() doesn't.
 
 How do I modify either function to use Recall()?  What exactly is
 Recall() calling here?
 
 --
 Robin Hankin
 Uncertainty Analyst
 Southampton Oceanography Centre
 European Way, Southampton SO14 3ZH, UK
  tel  023-8059-7743
 


Re: [R] Survey of moving window statistical functions - still looking for fast mad function

2005-04-01 Thread Gabor Grothendieck
Jaroslaw's article was great.  In fact, it was used as the basis for 
rapply and some optimized special cases that will be included in
the R 2.1.0 version of zoo (which have been coded but not yet
released).

Regarding numerically stable summation, check out the idea 
behind the following which I coincidentally am also considering 
for the zoo implementation:

   http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/393090
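
To make the stability issue concrete, here is a sketch that is not the
recipe from the link above, just two base R ways of computing the same
running mean.  The cumsum version is fast but carries rounding error
along the whole vector, while the embed version recomputes every window
from scratch and so cannot drift.  Both function names are made up:

```r
# fast: differences of a cumulative sum (rounding errors can accumulate)
runmean_cumsum <- function(x, k) {
  cs <- c(0, cumsum(x))
  (cs[(k + 1):length(cs)] - cs[1:(length(cs) - k)]) / k
}

# slower: each window is summed independently
runmean_embed <- function(x, k) rowMeans(embed(x, k))

x <- c(1, 2, 3, 4, 5, 6)
m1 <- runmean_cumsum(x, 3)
m2 <- runmean_embed(x, 3)

print(m1)
print(m2)
```

On well-scaled data the two agree; comparing them with all.equal on
your own data is a cheap way to see whether the fast version has
drifted.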

On Apr 1, 2005 8:07 PM, Vadim Ogranovich [EMAIL PROTECTED] wrote:
 Hi,
 
 First, let me thank Jaroslaw for making this survey. I find it quite
 illuminating.
 
 Now the questions:
 
 * the #1 solution below (based on cumsum) is numerically unstable.
 Specifically if you do the runmean on a positive vector you can easily
 get negative numbers due to rounding errors. Does anyone see a
 modification which is free of this deficiency?
 * is it possible to optimize the algorithm of the filter function,
 solution #2 below, for the case of the  rep(1/k,k) kernel?
 
 Thanks,
 Vadim
 
 [R] Survey of moving window statistical functions - still looking for
 fast mad function
 
 
 From: Tuszynski, Jaroslaw W. (JAROSLAW.W.TUSZYNSKI_at_saic.com)
 Date: Sat 09 Oct 2004 - 06:30:32 EST
 
 Hi,
 
 Lately I ran into a problem: my R code was spending hours
 performing simple moving window statistical operations. As a result I
 searched the archives for alternative (faster) ways of performing mean,
 max, median and mad operations over a moving window (size 81) on a vector
 with about 30K points. I timed several ways that were suggested, and a
 few that I came up with. The purpose of this email
 is to share some of my findings and ask for more suggestions (especially
 about a moving mad function).
 
 Sum over moving window can be done using many different ways. Here are
 some sorted from the fastest to the slowest:
 
 1.  runmean = function(x, k) {
       n = length(x)
       y = x[ k:n ] - x[ c(1,1:(n-k)) ]  # this is a difference from the previous cell
       y[1] = sum(x[1:k])                # find the first sum
       y = cumsum(y)                     # apply precomputed differences
       return(y/k)                       # return mean not sum
     }
 2.  filter(x, rep(1/k,k), sides=2, circular=T) - (stats package)
 3.  kernapply(x, kernel("daniell", m), circular=T)
 4.  apply(embed(x,k), 1, mean)
 5.  mywinfun <- function(x, k, FUN=mean, ...) {  # suggested in news group
       n <- length(x)
       A <- rep(x, length=k*(n+1))
       dim(A) <- c(n+1, k)
       sapply(split(A, row(A)), FUN, ...)[1:(n-k+1)]
     }
 6.  rollFun(x, k, FUN=mean) - (fSeries package)
 7.  rollMean(x, k) - (fSeries package)
 8.  SimpleMeanLoop = function(x, k) {  # simple-minded loop used as a baseline
       n = length(x)
       y = rep(0, n)
       k = k%/%2
       for (i in (1+k):(n-k)) y[i] = mean(x[(i-k):(i+k)])
     }
 9.  running(x, fun=mean, width=k) - (gtools package)
 
 Some of above functions return results that are the same length as x and
 some return arrays with length n-k+1. The relative speeds (on Windows
 machine) were as follow: 0.01, 0.09, 1.2, 8.1, 11.2, 13.4, 27.3, 63,
 345. As one can see there are about 5 orders of magnitude between the
 fastest and the slowest.
 
 Maximum over moving window can be done as follow, in order of speed
 
 1.  runmax = function(x, k) {
       n = length(x)
       y = rep(0, n)
       m = k%/%2
       a = 0
       for (i in (1+m):(n-m)) {
         if (a==y[i-1]) y[i] = max(x[(i-m):(i+m)])  # calculate max of the window
         else y[i] = max(y[i-1], x[i+m])            # max of the window is >=y[i-1]
         a = x[i-m]                                 # point that will be removed from the window
       }
       return(y)
     }
 2.  apply(embed(x,k), 1, max)
 3.  SimpleMaxLoop(x, k) - similar to SimpleMeanLoop above
 4.  mywinfun(x, k, FUN=max) - see above
 5.  rollFun(x, k, FUN=max) - fSeries package
 6.  rollMax(x, k) - fSeries package
 7.  running(x, fun=max, width=k) - gtools package
 The relative speeds were: 0.01, 3, 3.4, 5.3, 7.5, 7.7, 15.3
 
 Median over moving window can be done as follows:
 
 1.  runmed(x, k) - from stats package
 2.  SimpleMedLoop(x, k) - similar to SimpleMeanLoop above
 3.  apply(embed(x,k), 1, median)
 4.  mywinfun(x, k, FUN=median) - see above
 5.  rollFun (x, k, FUN=median) - fSeries package
 6.  running(x, fun=max, width=k) - gtools package
 Speeds: 0.01, 3.4, 9, 15, 29, 165
 
 Mad over moving window can be done as 

Re: [R] factor to numeric in data.frame

2005-04-02 Thread Gabor Grothendieck
Try this:

data.matrix(df.f12)
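
Using the data from the question, a sketch of what this gives and how
to get the column medians from it.  data.matrix replaces each factor
by its integer level codes, so this is only meaningful when the levels
are in the intended order, as stated in the question:

```r
f1 <- factor(c(rep("c1-low", 2), rep("c2-med", 5), rep("c3-high", 3)))
f2 <- factor(c(rep("c1-low", 5), rep("c2-low", 3), rep("c3-low", 2)))
df.f12 <- data.frame(f1, f2)

m <- data.matrix(df.f12)        # integer codes, one column per factor
med <- apply(m, 2, median)      # column-wise medians of the codes

print(m)
print(med)
```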

On Apr 2, 2005 6:01 AM, Heinz Tuechler [EMAIL PROTECTED] wrote:
 Dear All,
 
 Assume I have a data.frame that contains also factors and I would like to
 get another data.frame containing the factors as numeric vectors, to apply
 functions like sapply(..., median) on them.
 I read the warning concerning as.numeric or unclass, but in my case this
 makes sense, because the factor levels are properly ordered.
 I can do it, if I write for each single column unclass(...), but I would
 like to use indexing, e.g. unclass(df[1:10]).
 Is that possible?
 
 Thanks,
 Heinz Tüchler
 
 ## Example:
 f1 <- factor(c(rep('c1-low',2),rep('c2-med',5),rep('c3-high',3)))
 f2 <- factor(c(rep('c1-low',5),rep('c2-low',3),rep('c3-low',2)))
 df.f12 <- data.frame(f1,f2) # data.frame containing factors
 
 ## this does work
 df.f12.num <- data.frame(unclass(df.f12[[1]]),unclass(df.f12[[2]]))
 df.f12.num
 ## this does not work
 df.f12.num <- data.frame(unclass(df.f12[[1:2]]))
 df.f12.num
 ## this does not work
 df.f12.num <- data.frame(unclass(df.f12[1:2]))
 df.f12.num
 


Re: [R] extract date

2005-04-05 Thread Gabor Grothendieck
I just started using gmail and one thing that I thought would
be annoying but sometimes is actually interesting are the
ads at the right hand side.  They are keyed off the content
of the email and in the case of your post produced:

http://www.visibone.com/regular-expressions/?via=google120

http://www.regexpbuddy.com

The first one is advertising a javascript reference card (which
I happen to own and is excellent); but in any case, the contents 
of the regexp part of the reference card are fully reproduced on 
the web page and includes dozens of examples of regexps that 
you could try.  I haven't explored the other web site.

Although I have not read it, there is a book called Mastering
Regular Expressions.

By the way, here is an alternative to calculating nd in Prof.
Ripley's post just to give you something else to play with. 
I think I prefer his solution but this one is arguably a bit
simpler.  The three portions separated by the two | bars
are each deleted if they are present.  gsub causes it
to repeatedly try them so that it does not stop after
deleting the first one:

nd <- gsub("Date: |.*, | ..:.*$", "", dates)
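
Here is a small runnable sketch of that gsub call on three of the header lines quoted below. The vector dates is assumed to hold the raw lines as character strings; the result still mixes two-digit and four-digit years, so it needs further handling before conversion to a date.

```r
# Three of the header lines from the question, as a character vector
dates <- c("Date: Sat, 21 Feb 04 10:25:43 GMT",
           "Date: 13 Feb 2004 13:54:22 -0600",
           "Date: Fri, 14 Jun 2002 16:22:27 -0400")

# Delete the "Date: " prefix, anything up to a comma and space,
# and everything from the time of day onwards
nd <- gsub("Date: |.*, | ..:.*$", "", dates)
print(nd)
```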

On Apr 5, 2005 7:22 AM, Petr Pikal [EMAIL PROTECTED] wrote:
 Dear Prof.Ripley
 
 Thank you for your answer. After some trial and error I finished
 with a suitable extraction function which gives me a substantial
 increase in positive answers.
 
 Nevertheless I definitely need to gain more practice in regular
 expressions, but from the help page I can grasp only easy things. Is
 there any "Regular expressions for dummies" available?
 
 Best regards
 Petr Pikal
 
 On 5 Apr 2005 at 10:23, Prof Brian Ripley wrote:
 
  On Tue, 5 Apr 2005, Petr Pikal wrote:
 
   Dear all,
  
   please, is there any possibility how to extract a date from data
   which are like this:
 
  Yes, if you delimit all the possibilities.
 
   
   Date: Sat, 21 Feb 04 10:25:43 GMT
   Date: 13 Feb 2004 13:54:22 -0600
   Date: Fri, 20 Feb 2004 17:00:48 +
   Date: Fri, 14 Jun 2002 16:22:27 -0400
   Date: Wed, 18 Feb 2004 08:53:56 -0500
   Date: 20 Feb 2004 02:18:58 -0600
   Date: Sun, 15 Feb 2004 16:01:19 +0800
   
  
   I used
  
   strptime(paste(substr(x,12,13), substr(x,15,17), substr(x,19,22),
   sep="-"), format="%d-%b-%Y")
  
   which suits to lines 3:5 and 7 (such are the most common in my
   dataset) but obviously does not work with other lines.
 
  For those examples, in character vector 'dates' (without quotes):
 
   nd <- gsub("^[^0-9]*([0-9]+) ([A-Za-z]+) ([0-9]+).*",
              "\\1 \\2 \\3", dates)
   strptime(nd, "%d %b %y")
  [1] 2004-02-21 2020-02-13 2020-02-20 2020-06-14 2020-02-18
  [6] 2020-02-20 2020-02-15
 
  You should be able to amend the regexp for a wider range of forms, but
  your first line is ambiguous (2004 or 2021?) so there are limits.
 
   If there is no straightforward solution I can live with what I use now
   but some automagical function like
  
   give.me.date.from.my.string.regardles.of.formating(x)
   would be great.
 
  It would be impossible: when Americans write 07/04/2004 they do not
  mean April 7th.
 
  --
  Brian D. Ripley,  [EMAIL PROTECTED]
  Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
  University of Oxford,             Tel:  +44 1865 272861 (self)
  1 South Parks Road,                     +44 1865 272866 (PA)
  Oxford OX1 3TG, UK                Fax:  +44 1865 272595
 
 
 Petr Pikal
 [EMAIL PROTECTED]
 


Re: [R] lists: removing elements, iterating over elements,

2005-04-05 Thread Gabor Grothendieck
On Apr 5, 2005 1:36 PM, Paul Johnson [EMAIL PROTECTED] wrote:
 I'm writing R code to calculate Hierarchical Social Entropy, a diversity
 index that Tucker Balch proposed.  One article on this was published in
 Autonomous Robots in 2000. You can find that and others through his web
 page at Georgia Tech.
 
 http://www.cc.gatech.edu/~tucker/index2.html
 
 While I work on this, I realize (again) that I'm a C programmer
 masquerading in R, and its really tricky working with R lists.  Here are
 things that surprise me, I wonder what your experience/advice is.
 
 I need to calculate overlapping U-diametric clusters of a given radius.
   (Again, I apologize this looks so much like C.)
 
 ## Returns a list of all U-diametric clusters of a given radius
 ## Give an R distance matrix
 ## Clusters may overlap.  Clusters may be identical (redundant)
 getUDClusters <- function(distmat, radius){
   mem <- list()
 
   nItems <- dim(distmat)[1]
   for ( i in 1:nItems ){
     mem[[i]] <- c(i)
   }
 
   for ( m in 1:nItems ){
     for ( n in 1:nItems ){
       if (m != n && (distmat[m,n] <= radius)){
         ##item is within radius, so add to collection m
         mem[[m]] <- sort(c( mem[[m]],n))
       }
     }
   }
 
   return(mem)
 }
 
 That generates the list, like this:
 
 [[1]]
 [1]  1  3  4  5  6  7  8  9 10
 
 [[2]]
 [1]  2  3  4 10
 
 [[3]]
 [1]  1  2  3  4  5  6  7  8 10
 
 [[4]]
 [1]  1  2  3  4 10
 
 [[5]]
 [1]  1  3  5  6  7  8  9 10
 
 [[6]]
 [1]  1  3  5  6  7  8  9 10
 
 [[7]]
 [1]  1  3  5  6  7  8  9 10
 
 [[8]]
 [1]  1  3  5  6  7  8  9 10
 
 [[9]]
 [1]  1  5  6  7  8  9 10
 
 [[10]]
  [1]  1  2  3  4  5  6  7  8  9 10
 
 The next task is to eliminate the redundant elements.  unique() does not
 apply to lists, so I have to scan one by one.
 
   cluslist <- getUDClusters(distmat,radius)
 
   ##find redundant (same) clusters
   redundantCluster <- c()
   for (m in 1:(length(cluslist)-1)) {
     for ( n in (m+1): length(cluslist) ){
       if ( m != n && length(cluslist[[m]]) == length(cluslist[[n]]) ){
         if ( all(cluslist[[m]] == cluslist[[n]]) ){
           redundantCluster <- c( redundantCluster,n)
         }
       }
     }
   }
 
   ##make sure they are sorted in reverse order
   if (length(redundantCluster) > 0)
   {
     redundantCluster <- unique(sort(redundantCluster, decreasing=T))
 
     ## remove redundant clusters (must do in reverse order to preserve
     ## index of cluslist)
     for (i in redundantCluster) cluslist[[i]] <- NULL
   }
 
 Question: am I deleting the list elements properly?
 
 I do not find explicit documentation for R on how to remove elements
 from lists, but trial and error tells me
 
 myList[[5]] <- NULL
 
 will remove the 5th element and then close up the hole caused by
 deletion of that element.  That shuffles the index values, so I have to
 be careful in dropping elements. I must work from the back of the list
 to the front.
 
 Is there an easier or faster way to remove the redundant clusters?
 
 Now, the next question.  After eliminating the redundant sets from the
 list, I need to calculate the total number of items present in the whole
 list, figure how many are in each subset--each list item--and do some
 calculations.
 
 I expected this would iterate over the members of the list--one step for
 each subcollection
 
 for (i in cluslist){
 
 }
 
 but it does not.  It iterates over the items within the subsets of the
 list cluslist.  I mean, if cluslist has 5 sets, each with 10 elements,
 this for loop takes 50 steps, one for each individual item.
 
 I find this does what I want
 
 for (i in 1:length(cluslist))
 
 But I found out the hard way :)
 
 Oh, one more quirk that fooled me.  Why does unique() applied to a
 distance matrix throw away the 0's?  I think that's really bad!
 
  x <- rnorm(5)
  myDist <- dist(x,diag=T,upper=T)
  myDist
            1         2         3         4         5
  1 0.000     1.2929976 1.6658710 2.6648003 0.5494918
  2 1.2929976 0.000     0.3728735 1.3718027 0.7435058
  3 1.6658710 0.3728735 0.000     0.9989292 1.1163793
  4 2.6648003 1.3718027 0.9989292 0.000     2.1153085
  5 0.5494918 0.7435058 1.1163793 2.1153085 0.000
  unique(myDist)
  [1] 1.2929976 1.6658710 2.6648003 0.5494918 0.3728735 1.3718027 0.7435058
  [8] 0.9989292 1.1163793 2.1153085
 
 
 --

If L is our list of vectors then the following gets the unique
elements of L.  

I have assumed that the individual vectors are sorted 
(sort them first if not via lapply(L, sort)) and that each element 
has a unique name (give it one if not, e.g. names(L) <- seq(L)).

The first line binds them together into rows.   This will
recycle to make them the same length and give you a warning 
but that's ok since you only need to know if they are the same or 
not.  Now, unique applied to a matrix finds the unique rows and 
in the second line we use the row.names from that to get the original 
unsorted lists.

   mat <- unique(do.call("rbind", L))
   L[row.names(mat)]
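
To make the two lines above concrete, here is a small sketch on a made-up list (the clusters in L are hypothetical, not taken from the post). The last two lines show an alternative that is not part of the approach above: duplicated() also accepts a list, so the matrix step can be skipped.

```r
# Hypothetical clusters; each vector is already sorted
L <- list(c(1, 3, 5), c(2, 4), c(1, 3, 5), c(2, 4, 6), c(2, 4))
names(L) <- seq_along(L)

# rbind recycles the shorter vectors and warns; the warning is suppressed here
mat <- suppressWarnings(unique(do.call("rbind", L)))
res <- L[row.names(mat)]
print(names(res))

# Alternative without the matrix: duplicated() works directly on a list
res2 <- L[!duplicated(L)]
print(identical(res, res2))
```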

Regarding why the diagonal elements of a distance 

Re: [R] Introduce a new function in a package?

2005-04-06 Thread Gabor Grothendieck
Some other advantages of making your own package are:

- you can use help.search to search for your own functions even if you
  don't load the package

- if you can't even remember where your functions are (and I often
  can't) then you may not remember what they do either and packaging
  them gives a convenient way to associate documentation.  Once you
  have found your function you can use ? to get its documentation.

- you get to use 'R CMD check', which is very helpful

If you are doing it on Windows the amount of software you need to
download and install first may be a bit offputting and you may need
to sort out some path and latex problems but it's probably worth it
in the end if you do enough R development.
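
As a rough sketch of how one might start such a package, package.skeleton() from the utils package can generate the directory layout from existing functions. The package name "myutils" and the function hello below are made up for illustration, and the skeleton is written to a temporary directory rather than a real project location.

```r
# A personal utility function kept in its own environment
# ('hello' and 'myutils' are hypothetical names)
env <- new.env()
assign("hello", function(name = "world") paste("Hello,", name), envir = env)

# Write the package skeleton (DESCRIPTION, R/, man/) under a temp directory
pkg_parent <- tempdir()
suppressMessages(
  package.skeleton(name = "myutils", list = "hello",
                   environment = env, path = pkg_parent, force = TRUE)
)
print(list.files(file.path(pkg_parent, "myutils")))
```

The generated help file templates still need to be filled in before the package is built and checked.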

On Apr 6, 2005 10:55 AM, Don MacQueen [EMAIL PROTECTED] wrote:
 Expressions in .Rprofile are executed *before* any previously saved
 global environment is loaded (i.e., before the .RData file in the
 current working directory is loaded, causing the message
 "[Previously saved workspace restored]" to appear).
 
 If you define a function in .Rprofile, and then later answer yes to
 the Save workspace image? question when you quit R, the function
 will exist in the saved workspace.
 
 When you next start R, the version that comes in from .Rprofile will
 be replaced by the version in the saved workspace -- because the
 saved workspace is loaded after .Rprofile is executed.
 
 This means that if you decide to change the function in .Rprofile,
 your changes will immediately be lost when the previously saved
 workspace is loaded, since that has the previous version.
 
 So defining personal utility functions in .Rprofile is not very
 effective. Much, much, better to create a package, and then require()
 that package in .Rprofile. And since creating a package is really
 very easy, I strongly recommend that option.
 
 Saving the functions in an image file and then attaching it is fine,
 but less convenient, in my opinion, since you have to keep track of
 where it is in the file system.
 
 -Don
 
 At 4:09 PM +0100 4/6/05, Jan T. Kim wrote:
 On Wed, Apr 06, 2005 at 09:57:00AM -0400, Roger D. Peng wrote:
   I think the usual way is to create an R package for yourself and load
   it when you need it for whatever project.
 
   -roger
 
 Alternatively, one can also write the function in question into one's
 ~/.Rprofile; then, it's automatically available in all R sessions.
 To avoid confusion, make sure that you choose a unique name, i.e. one
 that isn't used by any package, if possible.
 
 This method should be used only for functions intended to provide some
 convenience in interactive sessions, code in scripts should not rely
 on functions being provided by ~/.Rprofile. For scripting, an R package
 is definitely preferred.
 
 Best regards, Jan
 
   Luis Ridao Cruz wrote:
   R-help,
   
   Sometimes I define functions I wish to have in any R session.
   The obvious thing to do is copy-paste the code
   The thing is that sometimes I don't know where I have the function
   code.
   
   My question is if somehow I could define a function and introduce it
   (let's say 'base' package ) so that
   could be used anytime I run a different R project.
   
   Thank you in advance
   
 
 --
   +- Jan T. Kim ---+
   |*NEW*email: [EMAIL PROTECTED]   |
   |*NEW*WWW:   http://www.cmp.uea.ac.uk/people/jtk |
   *-=  hierarchical systems are for files, not for humans  =-*
 
 
 --
 --
 Don MacQueen
 Environmental Protection Department
 Lawrence Livermore National Laboratory
 Livermore, CA, USA
 


Re: [R] Is a .R script file name available inside the script?

2005-04-06 Thread Gabor Grothendieck
It works for me.  Suppose in.txt is a two line file with these two lines:

file <- "Rscript.R"
source(file)

and Rscript.R is a two line file with these two lines:

script.description <- function() eval.parent(quote(file), n = 3)
print(basename(script.description()))

Then here is the output on Windows:

C:\Program Files\R\rw2001beta\bin> R --vanilla < in.txt

R : Copyright 2004, The R Foundation for Statistical Computing
[snip]
> file <- "Rscript.R"
> source(file)
[1] "Rscript.R"

Note that 'file' referred to in 'eval.parent' is not the variable that
you called 'file' but is an internal variable within the 'source'
program that is called 'file'.  It has nothing to do with your 'file',
which very well could have a different name.  In fact you
can just do this on Windows:

  echo source("Rscript.R") | R --vanilla

From:   Darren Weber [EMAIL PROTECTED]

That is useful, when calling the script like this:

 file <- "Rscript.R"
 source(file)

However, it does not work if we do this from the shell prompt:

$ R --vanilla < Rscript.R

because the eval.parent statement attempts to access a base
workspace that does not contain the file object/variable, as above.
Is there a solution for this situation?  Is the input script file
an argument to R and therefore available in something like argv?
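
On the argv question, a hedged sketch: when a script is redirected into R on standard input there is no file name on the command line to recover. If the script is instead started through a front end that passes a --file= argument (as the Rscript front end of later R versions does), the name can be picked out of commandArgs(). The helper script_from_args and the example argument vector below are made up for illustration.

```r
# Extract the script name from a command-line argument vector.
# 'script_from_args' is a hypothetical helper, not part of R.
script_from_args <- function(args = commandArgs(trailingOnly = FALSE)) {
  hit <- grep("^--file=", args, value = TRUE)
  if (length(hit) == 0) return(NA_character_)   # e.g. script came from stdin
  basename(sub("^--file=", "", hit[1]))
}

# Illustrative argument vector, as a front end might pass it
example_args <- c("R", "--no-echo", "--file=analysis/Rscript.R")
print(script_from_args(example_args))

# No --file= argument: nothing to recover
print(script_from_args(c("R", "--vanilla")))
```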

On Mar 18, 2005 8:00 PM, Gabor Grothendieck [EMAIL PROTECTED] wrote:
Darren Weber darrenleeweber at gmail.com writes:

:
: Hi,
:
: if we have a file called Rscript.R that contains the following, for example:
:
: x <- 1:100
: outfile = "Rscript.Rout"
: sink(outfile)
: print(x)
:
: and then we run
:
:  source("Rscript.R")
:
: we get an output file called Rscript.Rout - great!
:
: Is there an internal variable, something like .Platform, that holds
: the script name when it is being executed?  I would like to use that
: variable to define the output file name.
:

In R 2.0.1 try putting this in a file and sourcing it.

script.description <- function() eval.parent(quote(file), n = 3)
print(basename(script.description()))

If you are using R 2.1.0 (devel) then use this instead:

script.description <- function()
   showConnections() [as.character(eval.parent(quote(file), n = 3)),
   "description"]
print(basename(script.description()))



Re: [R] How to do aggregate operations with non-scalar functions

2005-04-07 Thread Gabor Grothendieck
On Apr 7, 2005 1:18 AM, Itay Furman [EMAIL PROTECTED] wrote:
 
 On Tue, 5 Apr 2005, Gabor Grothendieck wrote:
 
  On Apr 5, 2005 6:59 PM, Itay Furman [EMAIL PROTECTED] wrote:
 
  Hi,
 
  I have a data set, the structure of which is something like this:
 
  a <- rep(c("a", "b"), c(6,6))
  x <- rep(c("x", "y", "z"), c(4,4,4))
  df <- data.frame(a=a, x=x, r=rnorm(12))
 
  The true data set has 1 million rows. The factors a and x
  have about 70 levels each; combined together they subset 'df'
  into ~900 data frames.
  For each such subset I'd like to compute various statistics
  including quantiles, but I can't find an efficient way of
 
 [snip]
 
  I would like to end up with a data frame like this:
 
    a x         0%        25%
  1 a x -0.7727268  0.1693188
  2 a y -0.3410671  0.1566322
  3 b y -0.2914710 -0.2677410
  4 b z -0.8502875 -0.6505710
 
 [snip]
 
  One can use
 
    do.call("rbind", by(df, list(a = a, x = x), f))
 
  where f is the appropriate function.
 
  In this case f can be described in terms of df.quantile which
  is like quantile except it returns a one row data frame:
 
    df.quantile <- function(x,p)
      as.data.frame(t(data.matrix(quantile(x, p))))
 
    f <- function(df, p = c(0.25, 0.5))
      cbind(df[1,1:2], df.quantile(df[,"r"], p))
 
 
 Thanks!  Just what I wanted.
 
 A minor point is that for some reason the row numbers in the
 final data frame are not sequential (see below -- this is not a
 consequence of my changes).

These are the original row numbers of the first row of
each combo of a and x.  If z is the result of do.call
you can always do this:   row.names(z) <- 1:nrow(z)
if this is needed.
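
Putting the pieces from this thread together, a self-contained sketch might look like the following. To keep the result reproducible, r is set to 1:12 here instead of rnorm(12); that substitution is mine and not part of the original data.

```r
a <- rep(c("a", "b"), c(6, 6))
x <- rep(c("x", "y", "z"), c(4, 4, 4))
df <- data.frame(a = a, x = x, r = 1:12)   # deterministic stand-in for rnorm(12)

# like quantile(), but returns a one-row data frame
df.quantile <- function(x, p)
  as.data.frame(t(data.matrix(quantile(x, p))))

# first row's a and x, followed by the quantiles of r for that subset
f <- function(df, p = c(0.25, 0.5))
  cbind(df[1, 1:2], df.quantile(df[, "r"], p))

# one row per non-empty combination of a and x
z <- do.call("rbind", by(df, list(a = a, x = x), f))

# renumber the rows sequentially
row.names(z) <- 1:nrow(z)
print(z)
```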



Re: [R] package

2005-04-07 Thread Gabor Grothendieck
On Apr 7, 2005 8:43 AM, Gregory BENMENZER
[EMAIL PROTECTED] wrote:
 hello,
 
 I created a package with my functions, and I want to hide the code of some 
 functions.
 
 Could you help me ?
 
 Grégory

There was some discussion on the list that there is work being
done on an R compiler.  I don't know what the status is or whether
it would indeed solve your problem but you could try googling
around for it or maybe someone else on the list can provide
more info.



Re: [R] NA in table with integer types

2005-04-08 Thread Gabor Grothendieck
On Apr 8, 2005 9:05 AM, Paul Rathouz [EMAIL PROTECTED] wrote:
 
 OK.  Thanks.  So, if you use table() on a factor that contains NA's, but
 for which NA is not a level, is there any way to get table to generate an
 entry for the NAs?  For example, in below, even exclude=NULL will not
 give me an entry for NA on the factor y:
 
  x <- c(1,2,3,3,NA)
  y <- factor(x)
  y
 [1] 1    2    3    3    <NA>
 Levels: 1 2 3

summary(y)
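
A short sketch of what summary gives on the example above: for a factor it returns the count for each level plus an "NA's" entry when missing values are present. The table() call with useNA is an addition of mine and assumes a later version of R in which table() has that argument.

```r
x <- c(1, 2, 3, 3, NA)
y <- factor(x)

# counts per level, with an extra "NA's" entry
s <- summary(y)
print(s)

# Assumes a later R version in which table() has a useNA argument
tab <- table(y, useNA = "ifany")
print(tab)
```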



Re: [R] the difference between UseMethod and NextMehod?

2005-04-10 Thread Gabor Grothendieck
ronggui 0034058 at fudan.edu.cn writes:

: Hi useRs, I am studying R programming, but I cannot get the point about the
: difference between UseMethod and NextMethod. I have read the manual and tried
: to find the solution on the internet, but I still have not mastered it well.
: Can anyone give me a guide? It would be better to show some examples.


One normally uses UseMethod within a generic function to dispatch
the appropriate method while NextMethod is normally used within the 
function so dispatched.

An important difference is that UseMethod does not return, i.e.
statements after UseMethod are not evaluated, whereas NextMethod 
does return.

Have a look at print and print.ts for examples of UseMethod and
NextMethod, respectively.  Just type the following at the R prompt:

print
print.ts
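
Here is a small made-up example of the same pattern (the generic describe, the class "myclass" and the function noisy are hypothetical names, not from base R). The generic dispatches with UseMethod, the class method calls NextMethod to reach the default method, and noisy shows that code placed after UseMethod is never evaluated.

```r
# Generic: UseMethod picks the method based on the class of x
describe <- function(x, ...) UseMethod("describe")

# Default method
describe.default <- function(x, ...) paste("object of length", length(x))

# Class method: NextMethod calls the next method (here the default)
# and, unlike UseMethod, returns so its result can be used
describe.myclass <- function(x, ...) {
  base <- NextMethod()
  paste("myclass:", base)
}

obj <- structure(1:3, class = "myclass")
res <- describe(obj)
print(res)

# Statements after UseMethod are not evaluated
noisy <- function(x) {
  UseMethod("noisy")
  stop("never reached")
}
noisy.default <- function(x) "dispatched"
res2 <- noisy(1)
print(res2)
```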



Re: [R] How to change letters after space into capital letters

2005-04-11 Thread Gabor Grothendieck
On Apr 11, 2005 6:22 AM, Wolfram Fischer [EMAIL PROTECTED] wrote:
 What is the easiest way to change within vector of strings
 each letter after a space into a capital letter?
 
 E.g.:
  c( "this is an element of the vector of strings", "second element" )
 becomes:
  c( "This Is An Element Of The Vector Of Strings", "Second Element" )
 
 My reason to try to do this is to get more readable abbreviations.
 (A suggestion would be to add an option to abbreviate() which changes
 letters after space to uppercase letters before executing the abbreviation
 algorithm.)
 

Look for the thread titled

  String manipulation---mixed case

in the r-help archives.
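
For reference, one way this could be done (a sketch of my own, not necessarily what that thread suggests) is gsub with perl = TRUE, where \\U in the replacement upper-cases what follows it:

```r
x <- c("this is an element of the vector of strings", "second element")

# Upper-case the first letter of the string and each letter after whitespace
cap <- gsub("(^|\\s)([a-z])", "\\1\\U\\2", x, perl = TRUE)
print(cap)
```

The capitalised strings can then be passed to abbreviate().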



Re: [R] Sweave and abbreviating output from R

2005-04-11 Thread Gabor Grothendieck
On Apr 11, 2005 7:22 AM, Gavin Simpson [EMAIL PROTECTED] wrote:
 Dear List,
 
 I'm using Sweave to produce a series of class handouts for a course I am
 running. The students in previous years have commented about wanting
 output within the handouts so they can see what to expect the output to
 look like. So Sweave is a godsend for producing this type of handout -
 with one exception: Is there a way to suppress some rows of printed
 output so as to save space in the printed documentation? E.g
 
 rnorm(100)
 
 produces about 20 lines of output (depending on options(width)). I'd
 prefer something like:
 
 rnorm(100)
   [1]  0.527739021  0.185551107 -1.239195562  0.020991608 -1.225632520
   [6] -1.000243373 -0.020180393  2.552180776 -1.719061533 -0.195024625
 ...
   [96] -0.744916379  0.863733400 -0.186667848  1.378236663 -0.499201046
 
 The actual application would be printing of output from summary()
 methods. Ideally it would be nice to ask for line 1-10, 30-40, 100-102,
 for example, so you could print the first few lines of several sections
 of output. I'd like to automate this so I don't need to keep copying and
 pasting into the final tex source or forget to do it if I alter some
 previous part of the Sweave source.
 
 Has anyone tried to do this? Does anyone know of an automatic way of
 achieving the simple abbreviation or the more complicated version I
 described?
 
 Any thoughts on this?
 

Maybe you could use head(rnorm(100)) instead.  Check ?head
for other arguments.
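
A sketch of both ideas, under the assumption that it is acceptable to print captured text rather than the object itself: head() shortens the object, while capture.output() (my addition, not mentioned above) collects the printed lines so that chosen ones can be shown with a "..." marker in between. The names out and shown are illustrative.

```r
set.seed(1)
x <- rnorm(100)

# Option 1: print only the first few values
print(head(x, 10))

# Option 2: capture the printed lines and keep the first two and the last
out <- capture.output(print(x))
shown <- c(out[1:2], "...", out[length(out)])
cat(shown, sep = "\n")
```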



Re: [R] the difference between "x1" and x1

2006-04-20 Thread Gabor Grothendieck
Try this:

# test data
set.seed(1)
DF <- data.frame(x1 = rnorm(5), x2 = rnorm(5), x3 = rnorm(5))
DF
model.list <- c("x2", "x3")

# transform
for(v in model.list) DF[v] <- floor(DF[v])
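
The point is that a character string selects a column when it is used as an index into the data frame, so no eval is needed. As a variation on the loop above (my own sketch, with DF2 as an illustrative name), the same transformation can be written without a loop:

```r
set.seed(1)
DF <- data.frame(x1 = rnorm(5), x2 = rnorm(5), x3 = rnorm(5))
model.list <- c("x2", "x3")

# A string used as an index picks out the column of that name
first <- floor(DF[[model.list[1]]])

# Apply floor to all listed columns at once, leaving the others untouched
DF2 <- DF
DF2[model.list] <- lapply(DF2[model.list], floor)
print(DF2)
```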

On 4/20/06, Chad Reyhan Bhatti [EMAIL PROTECTED] wrote:
 Hello,

 I am not sure what to write in the subject line, but I would like to take
 a character string that is a variable in a data frame and apply a function
 that takes a numeric argument to this character string.

 Here is a simplified example that would solve my problem.
 Imagine I have my data stored in a data frame.
  x1 <- x2 <- x3 <- x4 <- x5 <- rnorm(20,0,1);
  data <- as.data.frame(cbind(x1,x2,x3,x4,x5));

 I have a vector containing the variables of interest as such.
  model.list <- c("x1","x3","x4");

  model.list[1]
 [1] "x1"

 I would like to loop through this vector and apply the floor() function to
 each variable.  In the current form the elements of model.list do not
 represent the variables in the data frame.

  floor(model.list[1])
 Error in floor(model.list[1]) : Non-numeric argument to mathematical
 function

  floor(eval(model.list[1]))
 Error in floor(eval(model.list[1])) : Non-numeric argument to mathematical
 function

  s <- expression(paste("floor(",model.list[1],")",sep=""))
  s
 expression(paste("floor(", model.list[1], ")", sep = ""))
  eval(s)
 [1] "floor(x1)"
 

 I have tried the obvious (to me) without success.  Perhaps someone could
 suggest a
 solution and some tidbits for me to read up on about the how and why.

 Thanks,

 Chad R. Bhatti



Re: [R] Considering port of SAS application to R

2006-04-21 Thread Gabor Grothendieck
R supports a number of databases and if you only need to work with a small
amount of data at once it should be readily do-able; however, R keeps objects
in memory and if you need large amounts at once then you could run into
problems.  Note that S-Plus keeps objects on disk and has other
features aimed at large data and might be an alternative if R cannot handle
the size and you want something based on the S language.

Since SAS was developed many years ago when optimizing computer
resources was more important than it is now it might be difficult to find
an alternative that matches it for performance with large data sets.

You probably want to quickly develop the core of your app in such a way
that it has the main performance characteristics of the full app so you
can get an idea of whether it will work prior to spending the time on the
full code.

Also note that R typically processes matrices faster than data frames
and, in general, how you write your application may affect its performance.
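
To illustrate the point about working with a small amount of data at once, here is a rough base-R sketch that reads a file in chunks and accumulates a total, so that only one chunk is in memory at a time. The file, its columns and the chunk size of 4 lines are all made up for the example; a real port would more likely pull chunks from a database.

```r
# Write a small illustrative CSV file (stand-in for a large data file)
path <- tempfile(fileext = ".csv")
write.csv(data.frame(id = 1:10, value = (1:10) * 1.5), path, row.names = FALSE)

con <- file(path, open = "r")
header <- readLines(con, n = 1)        # keep the header line for every chunk

total <- 0
n <- 0
repeat {
  lines <- readLines(con, n = 4)       # next chunk of at most 4 data lines
  if (length(lines) == 0) break        # end of file
  chunk <- read.csv(text = c(header, lines))
  total <- total + sum(chunk$value)    # aggregate, then discard the chunk
  n <- n + nrow(chunk)
}
close(con)

print(c(rows = n, total = total))
```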

On 4/21/06, Werner Wernersen [EMAIL PROTECTED] wrote:
 Hi there!

 I am considering to port a SAS application to R and I would like to hear your 
 opinion if you think this is possible and worthwhile. SAS is mainly used to 
 do data management and then to do some aggregations and simple computations 
 on the data and to output a modified data set. The main problem I see is the 
 size of the data file. As I have no access to SAS yet I cannot give real 
 details but the SAS data file is about 7 gigabytes large. (It's only the 
 basic SAS system without any additional modules)

 What do you think, would a port to R be possible with reasonable effort? Is R 
 able to handle that size of data? Or is R prepared to work together with some 
 database system?

 Thanks for your thoughts!

 Best regards,
  Werner




Re: [R] Creat new column based on condition

2006-04-21 Thread Gabor Grothendieck
Try:

V1 <- matrix(c(10, 20, 30, 10, 10, 20), nc = 1)

V2 <- 4 * (V1 == 10) + 6 * (V1 == 20) + 10 * (V1 == 30)

or

V2 <- matrix(c(4, 6, 10)[V1/10], nc = 1)
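
The second form relies on the V1 values being 10, 20 and 30, so that V1/10 is a valid index. A variation of my own that does not depend on that (the names keys and vals are illustrative) looks up each value with match():

```r
V1 <- c(10, 20, 30, 10, 10, 20)

keys <- c(10, 20, 30)   # values that can occur in V1
vals <- c(4, 6, 10)     # corresponding values for V2

# match() gives the position of each V1 value in keys
V2 <- vals[match(V1, keys)]
print(cbind(V1, V2))
```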

On 4/21/06, Sachin J [EMAIL PROTECTED] wrote:
 Hi,

  How can I accomplish this task in R?

V1
10
20
30
10
10
20

  Create a new column V2 such that:
  If V1 = 10 then V2 = 4
  If V1 = 20 then V2 = 6
  V1 =   30 then V2 = 10

  So the O/P looks like this

V1  V2
10   4
20   6
30  10
10   4
10   4
20   6

  Thanks in advance.

  Sachin



Re: [R] Creat new column based on condition

2006-04-21 Thread Gabor Grothendieck
DF <- data.frame(V1 = c(10, 20, 30, 10, 10, 20))
DF$V2 <- with(DF, 4 * (V1 == 10) + 6 * (V1 == 20) + 10 * (V1 == 30))
DF$V3 <- c(4, 6, 10)[DF$V1/10]

or

DF <- data.frame(V1 = c(10, 20, 30, 10, 10, 20))
DF <- transform(DF, V2 = 4 * (V1 == 10) + 6 * (V1 == 20) + 10 * (V1 == 30),
  V3 = c(4, 6, 10)[V1/10])

On 4/21/06, Sachin J [EMAIL PROTECTED] wrote:

 Hi Gabor,

 The first one works fine. Just out of curiosity, in second solution: I dont
 want to create a matrix. I want to add a new column to the existing
 dataframe (i.e. V2 based on the values in V1). Is there a way to do it?

 TIA
 Sachin





Re: [R] Creat new column based on condition

2006-04-21 Thread Gabor Grothendieck
Here is a compact solution using approx:

  DF$V2 <- approx(c(10, 20, 30), c(4,6,10), DF$V1)$y
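
A runnable version with the data from the question, as a sketch. One thing to keep in mind is that approx interpolates linearly, so a V1 value between the listed points would get an in-between result rather than an error; the variable mid below is only there to show that.

```r
DF <- data.frame(V1 = c(10, 20, 30, 10, 10, 20))

# look up V2 at the known points (10 -> 4, 20 -> 6, 30 -> 10)
DF$V2 <- approx(c(10, 20, 30), c(4, 6, 10), DF$V1)$y
print(DF)

# a value between the known points is interpolated
mid <- approx(c(10, 20, 30), c(4, 6, 10), 15)$y
print(mid)
```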




Re: [R] Minor documentation issue

2006-04-21 Thread Gabor Grothendieck
It is matched by the first argument, which is called "from", even though,
as the documentation indicates under the explanation of the last
form, it refers to the ending value.  Note, for example, that
seq(from = 3) gives 1:3 and not 3:1.

Also the help file does say:

"The interpretation of the unnamed arguments of 'seq' is _not_
 standard, ..."
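
A quick sketch that checks this behaviour (the variable names are illustrative): with a single argument, named or not, the value acts as the end of the sequence, while with two arguments the first is the start as usual.

```r
one_unnamed <- seq(3)          # single unnamed argument
one_named   <- seq(from = 3)   # same argument, named explicitly
two_args    <- seq(2, 5)       # here the first argument is the start

print(one_unnamed)
print(one_named)
print(two_args)
```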

On 4/21/06, Vivek Satsangi [EMAIL PROTECTED] wrote:
 (Sorry about the last email which was incomplete. I hit 'send' accidentally).

 I looked at ?seq. One of the forms given under Usage is seq(from).
 This would be the form used if seq is called with only one argument.
 However, this should actually say seq(to). For example,
  seq(1)
 [1] 1
  seq(3)
 [1] 1 2 3

 Cheers,
 --
 -- Vivek Satsangi
 Rochester, NY USA



Re: [R] Reorganizing rows and columns

2006-04-23 Thread Gabor Grothendieck
Use merge:

# test data
both <- list(structure(list(Doc = c(5, 9, 7, 5, 7, 9), Query = c(1, 1,
1, 2, 2, 2), Rank = c(1, 2, 3, 1, 2, 3)), .Names = c("Doc", "Query",
"Rank"), class = "data.frame", row.names = c("1", "2", "3", "4",
"5", "6")), structure(list(Doc = c(4, 5, 9, 8, 5, 7), Query = c(1,
1, 1, 2, 2, 2), Rank = c(1, 2, 3, 1, 2, 3)), .Names = c("Doc",
"Query", "Rank"), class = "data.frame", row.names = c("1", "2",
"3", "4", "5", "6")))

merge(both[[1]], both[[2]], all = TRUE, by = 1:2)
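
The same merge can be spelled out a little further. This is a sketch, not part of the original reply: the names f1, f2 and m are illustrative, the suffixes argument is used only to get the Rank1/Rank2 column names asked for, and the final cor() call is one possible way to get a Spearman correlation on the pairs ranked in both files:

```r
# The two result sets from the question, built directly as data frames
f1 <- data.frame(Doc = c(5, 9, 7, 5, 7, 9), Query = c(1, 1, 1, 2, 2, 2),
                 Rank = c(1, 2, 3, 1, 2, 3))
f2 <- data.frame(Doc = c(4, 5, 9, 8, 5, 7), Query = c(1, 1, 1, 2, 2, 2),
                 Rank = c(1, 2, 3, 1, 2, 3))

# Outer join on Doc and Query; unmatched ranks become NA
m <- merge(f1, f2, by = c("Doc", "Query"), all = TRUE, suffixes = c("1", "2"))

# Sort by Query, then Doc, as requested
m <- m[order(m$Query, m$Doc), ]
m

# Rank correlation using only rows where both ranks are present
rho <- cor(m$Rank1, m$Rank2, method = "spearman", use = "complete.obs")
rho
```

For a per-query comparison the cor() call could be wrapped in by(m, m$Query, ...).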

On 4/23/06, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:
 I'm sure this is a simple task, but how to do it has escaped me.

 I have imported data from two separate files (each file contains the
 results from an information retrieval algorithm) organized into a list.
 They are organized by File,Query, and Rank (in that order):

 [[1]]
 Doc   Query   Rank
 5 1   1
 9 1   2
 7 1   3
 5 2   1
 7 2   2
 9 2   3

 [[2]]
 Doc   Query   Rank
 4 1   1
 5 1   2
 9 1   3
 8 2   1
 5 2   2
 7 2   3

 I need to rearrange the data so that it is sorted by Query and Document,
 with columns for rank1 and rank2 (from files 1 and 2, respectively). For
 example:

 [[1]]
 Doc   Query   Rank1   Rank2
 4 1   NA  1
 5 1   1   2
 7 1   3   NA
 9 1   2   3
 5 2   1   2
 7 2   2   3
 8 2   NA  1
 9 2   3   NA

 My goal is to perform a Spearman/Kendall test to check the correlation
 between the rankings.

 Any help would be appreciated.

 Andrew Noyes



Re: [R] arrange data for simple regression analysis

2006-04-24 Thread Gabor Grothendieck
Here are a couple of possibilities using the builtin iris data set.  Note
that although the coefficients come out the same, the degrees of
freedom, etc., would differ:

> n <- rep(1:3, 50)
> lm(Petal.Length ~ Petal.Width, iris, weight = n)

Call:
lm(formula = Petal.Length ~ Petal.Width, data = iris, weights = n)

Coefficients:
(Intercept)  Petal.Width
      1.057        2.262

> lm(Petal.Length ~ Petal.Width, iris[rep(1:nrow(iris), n),])

Call:
lm(formula = Petal.Length ~ Petal.Width, data = iris[rep(1:nrow(iris),
n), ])

Coefficients:
(Intercept)  Petal.Width
      1.057        2.262
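
To make the remark about degrees of freedom concrete, the two fits can be compared directly. This is a sketch; the names fit_w and fit_r are illustrative:

```r
n <- rep(1:3, 50)

# Weighted fit: 150 rows, each counted n times through the weights
fit_w <- lm(Petal.Length ~ Petal.Width, data = iris, weights = n)

# Replicated fit: each row physically repeated n times (300 rows)
fit_r <- lm(Petal.Length ~ Petal.Width,
            data = iris[rep(seq_len(nrow(iris)), n), ])

all.equal(coef(fit_w), coef(fit_r))      # same coefficients
c(weighted = df.residual(fit_w),         # residual df differ
  replicated = df.residual(fit_r))
```

Standard errors and tests follow the residual degrees of freedom, so the choice between the two forms matters whenever inference is needed, not just point estimates.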

On 4/24/06, Tomás Revilla [EMAIL PROTECTED] wrote:
 Hello, I want to arrange data from a table to perform a simple
 regression. All the examples I saw deal with paired data, e.g. 'x' and
 'y' have the same dimensions (e.g. 5 values for x and 5 for y).

 But I have more than one 'y' for each 'x' value, e.g. the data file
 has a x = 0, 30, 60, and 120 columns. And for each of them I have
 several replicate responses (e.g. individuals), not allways the same
 number. After I read the data with read.table(), ending with 4
 columns, what is next? how can I regress this against c(0, 30, 60,
 120)?

 0   --   n1 y values
 30 --  n2 y values
 60 -- n3 y values
 120  -- n4 y values

 Thanks,

 Tomas



Re: [R] Sending an ESC command to the console from wihtin a script

2006-04-24 Thread Gabor Grothendieck
RSiteSearch("clear screen")

will locate Windows code to send a ctrl-L to the screen that you
can modify.

On 4/24/06, Tolga Uzuner [EMAIL PROTECTED] wrote:
 Hi,

 Is there a way to send an ESC command to the console from within a
 script window, without using the mouse ?

 Thanks,
 Tolga



Re: [R] Handling large dataset dataframe

2006-04-24 Thread Gabor Grothendieck
You only need the much smaller cross-product matrix X'X and the vector X'y,
so you can accumulate those as you read the data in chunks.


On 4/24/06, Sachin J [EMAIL PROTECTED] wrote:
 Hi,

  I have a dataset consisting of 350,000 rows and 266 columns.  Out of 266 
 columns 250 are dummy variable columns. I am trying to read this data set 
 into R dataframe object but unable to do it due to memory size limitations 
 (object size created is too large to handle in R).  Is there a way to handle 
 such a large dataset in R.

  My PC has 1GB of RAM, and 55 GB harddisk space running windows XP.

  Any pointers would be of great help.

  TIA
  Sachin




Re: [R] Change the language of the labels in a graph

2006-04-24 Thread Gabor Grothendieck
This works for me on my Windows XP system:

Sys.putenv(LANGUAGE="FR"); Sys.setlocale("LC_ALL","FR")
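
For month names specifically, the LC_TIME category is the one that matters. The following is a sketch for current versions of R (where Sys.putenv has been replaced by Sys.setenv). The locale name "fr_FR.UTF-8" is an assumption: names are platform dependent (on Windows something like "French" is typical), so the call may fail and leave the locale unchanged:

```r
old <- Sys.getlocale("LC_TIME")   # remember the current setting

# Try to switch date formatting to French; returns "" if the name is unknown
ok <- suppressWarnings(Sys.setlocale("LC_TIME", "fr_FR.UTF-8"))

# Month name as it would appear on a date axis
label <- format(as.Date("1960-05-01"), "%B")
print(label)

Sys.setlocale("LC_TIME", old)     # restore the original setting
```

Restoring the old value afterwards keeps the change from leaking into the rest of the session.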


On 4/24/06, Lapointe, Pierre [EMAIL PROTECTED] wrote:
 Hello,

 How do you change the language of the labels in a graph.  In this example, I
 want to get French labels by changing Sys.putenv.  I should get Mai
 instead of May.

 Sys.putenv(LANGUAGE="fr")
 x <- as.Date(c("1jan1960", "2jan1960", "31mar1960", "30jul1960"), "%d%b%Y")
 y <- 1:4
 plot(x,y)


 Regards,

 Pierre Lapointe




Re: [R] Handling large dataset dataframe [Broadcast]

2006-04-24 Thread Gabor Grothendieck
The other thing you could try after doing this is to sample
some rows from your data and see if the subset gives
nearly the same answer as the entire data set.
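
A sketch of that subset check on simulated data. Everything here is made up for illustration (the sizes, the true coefficients, and the names full and sub); with real data the "full" column would come from the chunked X'X computation:

```r
set.seed(1)
n <- 5000
p <- 5

# Simulated predictors and a response with known coefficients 1, 2, ..., p
x <- matrix(runif(n * p), n, p)
y <- drop(1 + x %*% seq_len(p) + rnorm(n))

# Fit on all rows
full <- coef(lm.fit(cbind(1, x), y))

# Fit on a random 10% of the rows
idx <- sample(n, 500)
sub <- coef(lm.fit(cbind(1, x[idx, ]), y[idx]))

# Side-by-side comparison of the two sets of estimates
round(cbind(full, sub), 2)
```

If the two columns agree to the precision you care about, the subset may be all that needs to fit in memory.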

On 4/24/06, Liaw, Andy [EMAIL PROTECTED] wrote:
 Here's a skeletal example.  Embellish as needed:

 p <- 5
 n <- 300
 set.seed(1)
 dat <- cbind(rnorm(n), matrix(runif(n * p), n, p))
 write.table(dat, file="c:/temp/big.txt", row=FALSE, col=FALSE)

 xtx <- matrix(0, p + 1, p + 1)
 xty <- numeric(p + 1)
 f <- file("c:/temp/big.txt", open="r")
 for (i in 1:3) {
     x <- matrix(scan(f, nlines=100), 100, p + 1, byrow=TRUE)
     xtx <- xtx + crossprod(cbind(1, x[, -1]))
     xty <- xty + crossprod(cbind(1, x[, -1]), x[, 1])
 }
 close(f)
 solve(xtx, xty)
 coef(lm.fit(cbind(1, dat[,-1]), dat[,1]))  ## check result

 unlink("c:/temp/big.txt")  ## clean up.

 Andy

 -Original Message-
 From: Sachin J [mailto:[EMAIL PROTECTED]
 Sent: Monday, April 24, 2006 5:09 PM
 To: Liaw, Andy; R-help@stat.math.ethz.ch
 Subject: RE: [R] Handling large dataset  dataframe [Broadcast]


 Hi Andy:

 I searched through R-archive to find out how to handle large data set using
 readLines and other related R functions. I couldn't find any single post
 which elaborates the process. Can you provide me with an example or any
 pointers to the postings elaborating the process.

 Thanx in advance
 Sachin


 Liaw, Andy [EMAIL PROTECTED] wrote:

 Instead of reading the entire data in at once, you read a chunk at a time,
 and compute X'X and X'y on that chunk, and accumulate (i.e., add) them.
 There are examples in S Programming, taken from independent replies by the
 two authors to a post on S-news, if I remember correctly.

 Andy

 From: Sachin J
 
  Gabor:
 
  Can you elaborate more.
 
  Thanx
  Sachin
 
  Gabor Grothendieck wrote:
  You just need the much smaller cross product matrix X'X and
  vector X'Y so you can build those up as you read the data in
  in chunks.
 
 
  On 4/24/06, Sachin J wrote:
   Hi,
  
   I have a dataset consisting of 350,000 rows and 266 columns. Out of
   266 columns 250 are dummy variable columns. I am trying to
  read this
   data set into R dataframe object but unable to do it due to memory
   size limitations (object size created is too large to
  handle in R). Is
   there a way to handle such a large dataset in R.
  
   My PC has 1GB of RAM, and 55 GB harddisk space running windows XP.
  
   Any pointers would be of great help.
  
   TIA
   Sachin
  
  
 


 


Re: [R] Add qoutation marks and combine values in a vector

2006-04-25 Thread Gabor Grothendieck
Use dQuote.  Assuming you have a data frame with the column
as factors:

DF <- data.frame(x = letters)  # test data
levels(DF$x) <- dQuote(levels(DF$x))
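
If the goal is the literal text c("Label 1","Label 2",...) with plain double quotes, the string can also be assembled by hand. This is a sketch using sprintf() and paste() rather than dQuote() (which may produce typographic quotes depending on the locale); the names labels, quoted and call_text are illustrative:

```r
labels <- paste("Label", 1:4)                # "Label 1" ... "Label 4"

# Wrap each element in plain double quotes
quoted <- sprintf('"%s"', labels)

# Join the quoted elements and wrap them in c( ... )
call_text <- paste0("c(", paste(quoted, collapse = ","), ")")
cat(call_text, "\n")
```

Note that if the other function simply takes a character vector, `labels` can be passed as it is and no quoting is needed.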


On 4/25/06, Jerry Pressnell [EMAIL PROTECTED] wrote:
 I wish to place quotation marks around each element of the following
 list;

 X1
 1  Label 1
 2  Label 2
 3  Label 3
 4  Label 4

 and combine the values in the following format for use in another
 function;

 c("Label 1","Label 2","Label 3","Label 4")

 Many thanks,

 Jerry



Re: [R] R help

2006-04-25 Thread Gabor Grothendieck
A similar question was just asked. See:

http://tolstoy.newcastle.edu.au/R/help/06/04/25898.html

On 4/25/06, Erez [EMAIL PROTECTED] wrote:
 Hello,

 I'm working with large matrix data and i would like to know if
 there is any way to reduce the size of it because even that I'm
 increasing the memory limit and that i have 1 gb memory the
 program throwing me out.
 There is any way to use a smaller size data (such as using bits or so)
 to reduce the size of it.

 Erez



Re: [R] by() and CrossTable()

2006-04-25 Thread Gabor Grothendieck
At least for this case I think you could get the effect without modifying
CrossTable like this:

as.CrossTable <- function(x) structure(x, class = c("CrossTable", class(x)))
print.CrossTable <- function(x) for(L in x) cat(L, "\n")

by(warpbreaks, warpbreaks$tension, function(x)
as.CrossTable(capture.output(CrossTable(x$wool, x$breaks > 30,
format="SPSS", fisher=TRUE))))
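
The same capture-then-print idea can be tried without gmodels installed. In this sketch noisy_summary() is a hypothetical stand-in for any function that, like CrossTable(), only writes its result with cat(); the class name "captured" is likewise made up:

```r
# Hypothetical stand-in: prints its result instead of returning it
noisy_summary <- function(d) {
  cat("n =", nrow(d), "\n")
  cat("mean breaks =", mean(d$breaks), "\n")
}

# Tag captured output lines with a class so they get their own print method
as.captured <- function(x) structure(x, class = c("captured", class(x)))
print.captured <- function(x, ...) {
  cat(unclass(x), sep = "\n")
  invisible(x)
}

# by() now prints each group's label followed by that group's captured output
res <- by(warpbreaks, warpbreaks$tension,
          function(d) as.captured(capture.output(noisy_summary(d))))
res
```

The output is held as text until by() prints it, which is what lets the group labels appear next to the right block.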


On 4/25/06, Marc Schwartz (via MN) [EMAIL PROTECTED] wrote:
 On Tue, 2006-04-25 at 11:07 -0400, Chuck Cleland wrote:
 I am attempting to produce crosstabulations between two variables for
  subgroups defined by a third factor variable.  I'm using by() and
  CrossTable() in package gmodels.  I get the printing of the tables first
  and then a printing of each level of the INDICES.  For example:
 
  library(gmodels)
 
  by(warpbreaks, warpbreaks$tension, function(x){CrossTable(x$wool,
  x$breaks > 30, format="SPSS", fisher=TRUE)})
 
 Is there a way to change this so that the CrossTable() output is
  labeled by the levels of the INDICES variable?  I think this has to do
  with how CrossTable returns output, because the following does what I want:
 
  by(warpbreaks, warpbreaks$tension, function(x){summary(lm(breaks ~ wool,
  data = x))})
 
  thanks,
 
  Chuck

 Chuck,

 Thanks for your e-mail.

 Without digging deeper, I suspect that the problem here is that
 CrossTable() has embedded formatted output within the body of the
 function using cat(), as opposed to a two step process of creating a
 results object, which then has a print method associated with it. This
 would be the case in the lm() example that you have as well as many
 other functions in R.

 I had not anticipated this particular use of CrossTable(), since it was
 really focused on creating nicely formatted 2d tables using fixed width
 fonts.

 That being said, I have had recent requests to enhance CrossTable()'s
 functionality to:

 1. Be able to assign the results of the internal processing to an object
 and be able to assign that object without any other output. For example:

  Results <- CrossTable(...)

 yielding no further output in the console.


 2. Facilitate LaTeX markup of the CrossTable() formatted output for
 inclusion in LaTeX documents.


 Both of the above would require me to fundamentally alter CrossTable()
 to create a CrossTable class object, as opposed to the current
 embedded output. I would then create a print.CrossTable() method
 yielding the current output, as well as one to create LaTeX markup for
 that application. The LaTeX output would likely need to support the
 regular 'table' style as well as 'ctable' and 'longtable' styles, the
 latter given the potential for long multi-page output.

 These changes should then support the type of use that you are
 attempting here.

 These are on my TODO list for CrossTable() (along with the inclusion of
 the measures of association recently discussed) and now that the dust
 has settled from some recent abstract submission deadlines I can get
 back to some of these things. I don't have a timeline yet, but will
 forge ahead with these enhancements.

 One possible suggestion for you as an interim, at least in terms of some
 nicely formatted n-way tables is the ctab() function in the 'catspec'
 package by John Hendrickx.

 A possible example call would be:

 ctab(warpbreaks$tension, warpbreaks$wool, warpbreaks$breaks > 30,
 type = c("n", "row", "column", "total"), addmargins = TRUE)


 Unlike CrossTable() which is strictly 2d (though that may change in the
 future), ctab() directly supports the creation of n-way tables, with
 counts and percentages/proportions interleaved in the output. There are
 no statistical tests applied and these would need to be done separately
 using by().


 Chuck, feel free to contact me offlist as other related issues may arise
 or as you have other comments on this.

 Again, thanks for the e-mail.

 Best regards,

 Marc Schwartz



Re: [R] Questions to RDCOMClient

2006-04-25 Thread Gabor Grothendieck
On 4/25/06, Dr. Michael Wolf [EMAIL PROTECTED] wrote:

 3. RDCOMCLient and Excel Manual
 ===

 Do you know a good overview of using Excel VBA code via RDCOMClient (e. g.
 sh$Select())? Are there people interesting in working out such a paper? I
 could contribute some experiences of my work to such a project (e. g.
 deleting Excel shapes from R and copying new charts made by R to a special
 position in a Excel sheet.

Normally what I do is just create whatever spreadsheet I want in Excel
with the Excel macro recorder turned on and then look at the macro output
and translate that to RDCOMClient.  There do exist some books on
VBA programming in Excel (I don't have any myself but have taken one
out from the library once) that could be helpful if the macro approach is
not sufficient.



Re: [R] www.r-project.org

2006-04-25 Thread Gabor Grothendieck
On 4/25/06, hadley wickham [EMAIL PROTECTED] wrote:
  The R Web site is working fine. Even if it is not relifted from a long
  time, it is functional. So, this is the point... and it should remain,
  at least, as functional as it is.

 As an experienced user of the R website, this probably is true for
 you.  However, there are a number of confusing problems for new users
 of the site:

  * how do you download R?
  * how do you bookmark a specific page?
  * what is that giant graphic on the home page?

* can't get to contributed docs directly from home page



Re: [R] www.r-project.org

2006-04-25 Thread Gabor Grothendieck
It's not hard if you know what to do, but if you don't then it's a nuisance
to figure it out every time.

On 4/25/06, Gavin Simpson [EMAIL PROTECTED] wrote:
 On Tue, 2006-04-25 at 14:09 -0400, Gabor Grothendieck wrote:
  On 4/25/06, hadley wickham [EMAIL PROTECTED] wrote:
The R Web site is working fine. Even if it is not relifted from a long
time, it is functional. So, this is the point... and it should remain,
at least, as functional as it is.
  
   As an experienced user of the R website, this probably is true for
   you.  However, there are a number of confusing problems for new users
   of the site:
  
* how do you download R?
* how do you bookmark a specific page?
* what is that giant graphic on the home page?
 
  * can't get to contributed docs directly from home page

 Isn't Other  Contributed Documentation sufficient? Usability guidelines
 for websites suggest that you should have as few top-level menu items as
 possible, say 5-6 max... OK the R website is not like
 insert_company_name.com type website, but you wouldn't want to flood
 users with too many options up front.

 G

 --
 %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 *  Note new Address, Telephone  Fax numbers from 6th April 2006  *
 %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Gavin Simpson
 ECRC  ENSIS  [t] +44 (0)20 7679 0522
 UCL Department of Geography   [f] +44 (0)20 7679 0565
 Pearson Building  [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street  [w] http://www.ucl.ac.uk/~ucfagls/cv/
 London, UK.   [w] http://www.ucl.ac.uk/~ucfagls/
 WC1E 6BT.
 %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%





Re: [R] www.r-project.org

2006-04-25 Thread Gabor Grothendieck
On Windows, right click the web page, choose Properties and
copy the url there.

On 4/25/06, Spencer Graves [EMAIL PROTECTED] wrote:
 see inline

 hadley wickham wrote:

 The R Web site is working fine. snip
 
  As an experienced user of the R website, this probably is true for
  you.  However, there are a number of confusing problems for new users
  of the site:
 
   * how do you download R?
   * how do you bookmark a specific page?

 *** If I find something with R Site Search on www.r-project.org, I
 can NOT just copy the web address into an email, because the address
 is still www.r-project.org.  However, if I use RSiteSearch from within
  R, I get an honest address (like
 "http://finzi.psych.upenn.edu/R/Rhelp02a/archive/47417.html"), which I
 can then paste into an email like this.

  If it weren't too difficult to display the address for each item
 retrieved from the archives, it would make it easier it use R Site
 Search without opening R.

  Thanks to all the core R team, including Jonathan Baron, whose
 support of R Site Search has prevented me from tearing my hair out
 on many occasions (and I don't have much left to tear out).  When people
 ask me questions about S-Plus, I often go to R Site Search, and then
 see if I can somehow use in S-Plus any R solution I find.

  Best Wishes,
  spencer graves
 p.s.  I also have a strong preference for avoiding fancy features.  I've
 been burned so many times with viruses and software that never performed
 as advertized for many unknown reasons that I routinely check "no" when
 asked if I want to install Macromedia Flash, and I hope I won't have
 to install it to use a future version of www.r-project.org.

   * what is that giant graphic on the home page?
 
  Hadley
 


Re: [R] new.frame()

2006-04-26 Thread Gabor Grothendieck
Also, the proto and R.oo packages provide object oriented
ways of working with environments.
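
For reference, the new.env and local approaches mentioned in the quoted reply below can be sketched in base R as follows. All names here (shared, bump, current, next_id) are illustrative:

```r
# An explicit environment shared by several functions
shared <- new.env()
shared$count <- 0

bump <- function() {
  shared$count <- shared$count + 1   # modifies the shared environment in place
  invisible(shared$count)
}
current <- function() shared$count

bump()
bump()
print(current())

# local() gives a function its own private, persistent variable
next_id <- local({
  n <- 0
  function() {
    n <<- n + 1   # updates n in the enclosing local() environment
    n
  }
})
next_id()
print(next_id())
```

Environments are not copied on assignment, which is why both bump() and current() see the same count.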

On 4/26/06, Prof Brian Ripley [EMAIL PROTECTED] wrote:
 ?new.env
 ?local

 should help you.  R works with environments, basically a frame plus an
 enclosure.

 On Wed, 26 Apr 2006, Anna Whitfield wrote:

  Hello,
 
  I would like to know whether R has a homogeneous function
  of S-plus's new.frame(), which create explicit frames in
  the evaluator and provide a locale for computations that
  can be shared among various functions.
 
  new.frame() in S-plus:
  http://www.uni-muenster.de/ZIV/Mitarbeiter/BennoSueselbeck/s-html/helpfiles/new.frame.html
 
  Thanks.
 
 

 --
 Brian D. Ripley,  [EMAIL PROTECTED]
 Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
 University of Oxford, Tel:  +44 1865 272861 (self)
 1 South Parks Road, +44 1865 272866 (PA)
 Oxford OX1 3TG, UKFax:  +44 1865 272595



Re: [R] environment

2006-04-26 Thread Gabor Grothendieck
A third possibility is to use the proto package and define
a proto object (an environment with special meaning for $)
containing the two components .x and g like this:

library(proto)
p <- proto(.x = 2, g = function(.) { print(.$.x); .$.x <- 3 })
p$.x  # 2
print(p$g())  # 2, 3
p$.x # 3

or you can write the print statement as with(., print(.x))
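
For comparison, the two base R possibilities from the quoted reply can be sketched like this. The function names g2 and f2 are made up for the second variant; the first reuses the names from the question:

```r
# Variant 1: superassignment into the enclosing environment of g
g <- function() {
  print(.x)
  .x <<- 3            # assigns to the .x found in g's enclosure
}
f <- function() {
  environment(g) <- environment()   # local copy of g enclosed by f's frame
  .x <- 2
  g()
  .x
}
result1 <- f()

# Variant 2: assign() into the caller's frame
g2 <- function() {
  assign(".x", 3, envir = parent.frame())
}
f2 <- function() {
  .x <- 2
  g2()
  .x
}
result2 <- f2()

c(result1, result2)
```

Neither variant copies .x when g is called, which addresses the concern about passing a large object around in a loop.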

On 26 Apr 2006 11:02:58 +0200, Peter Dalgaard [EMAIL PROTECTED] wrote:
 Romain Francois [EMAIL PROTECTED] writes:

  Hi,
 
  Consider the code :
 
  g <- function(){
    print(.x)
    .x <- 3
  }
 
  f <- function(){
    environment(g) <- environment()
    .x <- 2
    g()
    .x
  }
 
f()
  [1] 2
  [1] 2
 
 
  I would like f() to return 3. How can I do that ? Am I completely out of
  place ?
  Doing that, I want to avoid to pass .x as a parameter in f, because in
  real life .x is pretty big and g() is called over and over in a loop.

 If you want to assign into the environment of g, you'll need <<- ,
 otherwise you assign to a local variable.

 Another possibility involves assign(..., parent.frame())

And a third possibility is:

library(proto)



Re: [R] environment

2006-04-26 Thread Gabor Grothendieck
Also, if you don't need to create child objects that override .x,
and I don't think you do here, p could be further simplified to
this (only the print statement has been changed):

p <- proto(.x = 2, g = function(.) { print(.x); .$.x <- 3 })


On 4/26/06, Gabor Grothendieck [EMAIL PROTECTED] wrote:
 A third possibility is to using the proto package and define
 a proto object (an environment with special meaning for $)
 containing the two components .x and g like this:

 library(proto)
 p <- proto(.x = 2, g = function(.) { print(.$.x); .$.x <- 3 })
 p$.x  # 2
 print(p$g())  # 2, 3
 p$.x # 3

 or you can write the print statement as with(., print(.x))

 On 26 Apr 2006 11:02:58 +0200, Peter Dalgaard [EMAIL PROTECTED] wrote:
  Romain Francois [EMAIL PROTECTED] writes:
 
   Hi,
  
   Consider the code :
  
   g <- function(){
     print(.x)
     .x <- 3
   }
  
   f <- function(){
     environment(g) <- environment()
     .x <- 2
     g()
     .x
   }
  
 f()
   [1] 2
   [1] 2
  
  
   I would like f() to return 3. How can I do that ? Am I completely out of
   place ?
   Doing that, I want to avoid to pass .x as a parameter in f, because in
   real life .x is pretty big and g() is called over and over in a loop.
 
  If you want to assign into the environment of g, you'll need <<- ,
  otherwise you assign to a local variable.
 
  Another possibility involves assign(..., parent.frame())

 And a third possibility is:

 library(proto)




Re: [R] About regression and plot

2006-04-26 Thread Gabor Grothendieck
On 4/26/06, Daniel Yang [EMAIL PROTECTED] wrote:
 --- I changed to format to text instead of html ---

 Dear R-help,

 This is my first R day.  I want to ask some more beginner's questions.

Read the posting guide at the bottom of each post to r-help.


 Q1. How can I obtain the covariance matrix for parameter estimates of a
 multiple regression?  I checked ?lm but didn't get the information.


?vcov
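
A minimal illustration with the builtin iris data (the model itself is arbitrary and chosen only for the example):

```r
fit <- lm(Petal.Length ~ Petal.Width + Sepal.Length, data = iris)

# Covariance matrix of the parameter estimates
V <- vcov(fit)
V

# Its diagonal gives the squared standard errors reported by summary()
se <- sqrt(diag(V))
cbind(se, summary(fit)$coefficients[, "Std. Error"])
```
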

 Q2. How can I see the old graphs in the graph window?

Assuming you are using Windows, create a plot, e.g. plot(1),
change focus to the plot, i.e. left click it, and choose Recording
from the History menu.  From that point on in the session (or
until you turn it off), it will record your plots and you can change
focus to the plot window and use PageUp and PageDn keys to
move through them.


 Q3. Can R plot animated graph?  For example, I want to see the dynamic
 change of a 2D graph during a time period.

RSiteSearch("animation")



Re: [R] Were to find appropriate functions for a given task in R

2006-04-26 Thread Gabor Grothendieck
There is a reference sheet here:
   http://www.rpad.org/Rpad/R-refcard.pdf
a function finder here:
   http://biostat.mc.vanderbilt.edu/s/finder/find.html
and task views here:
   http://cran.r-project.org/src/contrib/Views/

Also use of RSiteSearch and help.search from within R
can be helpful.
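
As a small sketch of searching from within R: the search terms below are only examples, and the two commented-out calls open a results viewer or a browser, so they are left for interactive use:

```r
# Objects on the search path whose names contain "table"
hits <- apropos("table")
head(hits)

# help.search("goodness of fit")   # searches the help pages of installed packages
# RSiteSearch("CrossTable")        # searches the online archives in a browser

# One place to find a confidence interval for a proportion: stats::prop.test
ci <- prop.test(15, 50)$conf.int
ci
```

apropos() only sees attached packages, while help.search() covers everything installed, so the two complement each other.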

On 4/26/06, Albert Sorribas [EMAIL PROTECTED] wrote:
 This is a generic request concerning where to look to find
 appropriate information on a precise procedure in R.
 I'm using R for teaching introductory statistics and my students are
 learning how to deal with it. However, I find it difficult to locate
 some of the procedures. For instance, for basic crosstabulation, it is
 obvious that basic functions as table, ftable, and prop.table can be
 used. But there is a CrossTable function that is very useful. This is
 hidden in gmodels and gregmisc, as far as I've been able to explore the
 packages. However, there is no way (unless I sit down to r-help for
 hours) to be sure if there is some other place in which a very useful
 function is hidden for table manipulation (for instance controlling for
 other variables). This is only an example. But there are many more. Were
 to look for CI for proportions? I can find it but it is not easy.

 I understand R is more appropriate for difficult statistical procedures
 (glm and similar), BUT students need to start somewhere….

 My specific claim is about the need for a sort of guide in which the
 different procedures could be classified (and some redundancies could be
 deleted…..by the way). Is there something similar around? Any project
 working on this? Any clue for?

 If not, I would suggest starting some kind of easy reference based on
 the problem to solve. This could indicate where to look. The other day I
 found that package vcd has a function for testing the
 goodness-of-fit of a sample to binomial and other distributions….but
 this was VERY difficult to locate.

 Anyway, as usual, any indication will be very useful (especially for my
 students!!!)



 Albert Sorribas
 Professor of Statistics and Operational Research
 Departament de Ciències Mèdiques Bàsiques
 Universitat de Lleida
 Montserrat Roig 2
 25008-Lleida (Espanya)
 web.udl.es/Biomath/Group





Re: [R] www.r-project.org

2006-04-26 Thread Gabor Grothendieck
Maybe a separate web site that shows R off or maybe just
a pointer to the R Graph Gallery.

On 4/26/06, Duncan Murdoch [EMAIL PROTECTED] wrote:
 Romain Francois wrote:
  Dear R users and developers,
 
  My question is addressed to both of you, so I chose R-help to post it.
 
  Are there any plans to jazz up the main R website : http://www.r-project.org
  The look it has now has been the same for a long time and is kind of sad
  compared to other statistical packages' websites. Of course, the
  comparison is not fair, since companies are paying web designers to draw
  lollipop websites ...

 There have been various suggestions along these lines (check the
 archives), but there are a number of constraints that make the problem
 difficult:

  - there are two web sites, www.r-project.org and cran.r-project.org
 with different needs.  In particular, CRAN must be very low tech because
 it is mirrored on very diverse sites (including local copies, e.g. on a
 CDROM).

  - There are a lot of busy people who need to edit these pages
 occasionally, so a stable, standard, simple setup is extremely
 desirable.  That means simple HTML to be edited in a text editor, no
 special CMS.

 These requirements are quite hard to meet, so expect changes to the web
 sites to be very time consuming, and possibly rejected en masse in the end.

 Duncan Murdoch

 
  My first idea was to organize some kind of web designing contest.
  But, I had a small talk with Friedrich Leisch about that, who said that
  I shouldn't expect too many competitors.
  So, what about creating a small team, create a home page project and
  then propose it to the core team.
  It goes without saying it : The core team has the final word.
 
  What do you think ? Who would like to play ?
 
  Romain
 



Re: [R] help using tapply

2006-04-26 Thread Gabor Grothendieck
On 4/26/06, Dimitri Szerman [EMAIL PROTECTED] wrote:
 Dear R-mates,

 # Here's what I am trying to do. I have a dataset like this:

 id = c(rep(1,8), rep(2,8))
 dur1 <- c( 17,18,19,18,24,19,24,24 )
 est1 <- c( rep(1,5), rep(2,3) )
 dur2 <- c(1,1,3,4,8,12,13,14)
 est2 <- rep(1,8)

 mydata = data.frame(id,
                     estat=c(est1, est2),
                     durat=c(dur1, dur2))


 # I want to one have this:

 id = c(rep(1,8), rep(2,8))
 dur1 <- c( 17,18,19,20,28,1,2,3 )
 est1 <- c( rep(1,5), rep(2,3) )
 dur2 <- c(1,2,3,4,12,13,14,15)
 est2 <- rep(1,8)

 mydata2 = data.frame(id,
                      estat=c(est1, est2),
                      durat=c(dur1, dur2))


 # What is happening here? I have a longitudinal dataset.
 # Individuals are observed 8 times, and each time each of them are in a
 certain state J (here, J={1,2}).
 # Each observation is one unit of time away from the following one, except
 observations 4 and 5, which are 8 units of time away from each other.
 # So here we have individual 1 migrating from state 1 to state 2 at
 observation #6,
 # while individual 2 stays in state 1 as long as we can observe her.
 # I am interested in the spell (duration) of each state.
 # However, the durations are clearly mismesuared, and now I am trying to
 give some consistency to the data.
 # I am assuming that the first duration is correct. Departing from this, I
 wrote the following function:

 d <- function(dur,est)
 {
   if ( sum( diff(est) )==0 ) # for those who didn't change state
   {
     for( i in c(2:4))
       dur[i] <- dur[i-1] + 1

     dur[5] <- dur[4] + 8

     for( i in c(6:8) )
       dur[i] <- dur[i-1] + 1
   }
   if ( sum( diff(est) )!=0 ) # for those who changed state
   {
     j = which(diff(est)!=0) + 1   # j is when the change occurred
     dur[j] = 1

     k0  = which( c(1:8) < j )[-c(1)]
     k1  = which( c(1:8) > j )
     if(length(j) > 1)
     {
       for( i in 1:(length(j)-1) )
         k2 = c(1:8)[c(1:8) > j[i] & c(1:8) < j[i+1]]
       k = unique( c(k0,k1,k2) )
     }
     k = unique( c(k0,k1) )
     k = k[!k%in%j]
     if(5%in%k)
     {
       k = k[k != 5]
       for(i in k[k<5])
         dur[i] = dur[i-1] + 1

       dur[5] = dur[4] + 8

       for(i in k[k>5])
         dur[i] = dur[i-1] + 1
     } else
     {
       for(i in k)
         dur[i] = dur[i-1] + 1
     }
   }
   dur

 }

 # Now, if a do

d(dur1, est1)
 # and
d(dur2,est2)
 # I get what I want, except from the fact that I couldn't do this for a
 large dataset.
 # So I decide to use tapply. But this gives me

 new.durat <- tapply(mydata$durat, IND=mydata$id, FUN=d,
                     est=mydata$estat)
 mydata$new.durat <- unlist(new.durat)

 mydata
     id estat durat new.durat
 1    1     1    17        17
 2    1     1    18        18
 3    1     1    19        19
 4    1     1    18        20
 5    1     1    24        28
 6    1     2    19        29
 7    1     2    24        30
 8    1     2    24        31
 9    2     1     1         1
 10   2     1     1         2
 11   2     1     3         3
 12   2     1     4         4
 13   2     1     8        12
 14   2     1    12        13
 15   2     1    13        14
 16   2     1    14        15

 # what is not what I want. I can't figure it out why, but when I use tapply,
 # the logical expression sum( diff(est) )==0 turns out to be true for both
 individuals
 # (whereas we know this is true only for individual #2).
 # I am sorry for the long message. I will be very grateful for any help with
 this problem.

I didn't try to read all this carefully but I think you want to tapply
over the indices so you can use them in both columns:

with(mydata,
  unlist(tapply(seq(id), id, function(i) d(durat[i], estat[i])))
)

or use by:

unlist(by(mydata, mydata$id, function(x) d(x$durat, x$estat)))
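
The reason the original call misbehaves is that est=mydata$estat hands the
full 16-element column to every call of d, so each group is tested against
everyone's states. Over the full column sum(diff(est)) is 0, because
individual 1's step from 1 to 2 is cancelled by the step back to individual
2's state 1. A minimal sketch of the difference, using made-up data and a
hypothetical stand-in changed() in place of d():

```r
# Made-up data in the same layout as the question (not the poster's values)
mydata <- data.frame(id    = rep(1:2, each = 4),
                     estat = c(1, 1, 2, 2, 1, 1, 1, 1),
                     durat = c(5, 6, 9, 9, 1, 1, 3, 4))

# Hypothetical stand-in for d(): flags whether the individual changed state
changed <- function(dur, est) rep(any(diff(est) != 0), length(dur))

# Passing the whole column: every group sees all 8 states
whole <- tapply(mydata$durat, mydata$id, changed, est = mydata$estat)

# Indexing both columns by group, so each call sees only its own rows
by_index <- with(mydata,
  unlist(tapply(seq_along(id), id, function(i) changed(durat[i], estat[i]))))

# The same split done with by()
by_df <- unlist(by(mydata, mydata$id, function(x) changed(x$durat, x$estat)))
```

Here whole flags both individuals as having changed state, while by_index
and by_df flag only individual 1.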



Re: [R] copying previously installed libraries to R 2.3.0

2006-04-26 Thread Gabor Grothendieck
Note that the FAQ is really no different from batchfiles since the
FAQ does not address how to move or copy the packages and the batchfiles
scripts just do that (with the safeguard that they will not overwrite any
packages that are already there so you could, for example, manually
install a few packages and then move or copy the remaining ones over.)

I think there was some discussion that 2.3.0 might have moving/copying
capability built into the R installer but since it seems that
that did not make it into 2.3.0 it should be possible to use
the scripts, as before, or, alternatively, just re-install all your
packages from scratch.

By the way, there were some comments about the advantage of
keeping your packages in a library so that you can just update
the library.  The problem with that is that if you want to keep
multiple versions of R on the same system then you will want
to make sure that each R version has packages that run with that
version of R so if you clobber the ones that run with 2.2.0, say, by
overwriting them with 2.3.0 packages then you can no longer use
that library with 2.2.0.  If you keep the packages in .../R/R-2/library
then you can be sure that each R version has the right packages in
its library without messing around.

On 4/26/06, Thomas Harte [EMAIL PROTECTED] wrote:
 the windoze faq that you refer to doesn't quite address the question that i 
 asked, but thanks all the same.

 2.8 What's the best way to upgrade?  That's a matter of taste.  For most 
 people the best thing to do is to uninstall R (see the previous Q), install 
 the new version, copy any installed packages to the library folder in the new 
 installation, run update.packages() in the new R (`Update packages...' from 
 the Packages menu, if you prefer) and then delete anything left of the old 
 installation.  Different versions of R are quite deliberately installed in 
 parallel folders so you can keep old versions around if you wish.
 Upgrading from R 1.x.y to R 2.x.y is special as all the packages need to be 
 reinstalled.  Rather than copy them across, make a note of their names and 
 re-install them from CRAN.


 Christos Hatzis [EMAIL PROTECTED] wrote: See Windows FAQ 2.8 - works well.

 -Christos


 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of Thomas Harte
 Sent: Wednesday, April 26, 2006 2:54 PM
 To: r-help@stat.math.ethz.ch
 Subject: [R] copying previously installed libraries to R 2.3.0

 hi all,

 is there a new mechanism in R 2.3.0 for copying libraries from, say, R 2.2.1
 to R 2.3.0? i ask because gabor grothendieck comments in his copydir.bat
 (from gabor's batchfiles at:
 http://cran.r-project.org/contrib/extra/batchfiles/batchfiles_0.2-5.zip ):

 ``::   I personally upgraded my 2.1.0 to 2.2.0 this way so it seems ok until
  ::   R replaces this with something better which is expected for 2.3.0.
 '''

 see also the posting below.

 cheers,

 thomas.


 [R] copy contributed packages from R 2.2.0 to 2.2.1

 From: Ronnie Babigumira
 Date: Fri, 23 Dec 2005 15:58:36 +0100
Hi Helli, this came up last week, Here are some of the replys posted

 1.
  In http://cran.r-project.org/contrib/extra/batchfiles/batchfiles_0.2-5.zip

 are two Windows XP batch files:

 movedir.bat
  copydir.bat

 which will move the packages (which is much faster and suitable if you don't
 need the old version of R any more) or copy  the packages (which takes
 longer but preserves the old version).

 2.
  x <- installed.packages()[,1]
  install.packages(x)

 3.
  This is one reason we normally recommend that you install into a separate
 library.  Then update.packages(checkBuilt = TRUE) is all that is needed.
 However,

  foo <- installed.packages()
  as.vector(foo[is.na(foo[, "Priority"]), 1])

 will give you a character vector which you can feed to install.packages(),
 so it's not complex to do manually.

 4.
  If the previous installation is still alive, fire it up and

  pS <- packageStatus()
  pkgs <- pS$inst$Package[!pS$inst$Priority %in% c("base", "recommended")]
  save(pkgs, file = "foo")

 In the new installation,

  load("foo")
  install.packages(pkgs)

 Helmut Kudrnovsky wrote:
  hi R-users,
 
  a few days ago R 2.2.1 came out. on my win xp i'installed R 2.2.0. along
 the time i've installed a lot of contributed packages. my
 internet-connection is not very fast.
 
  so my question: is it possible after installing R 2.2.1 to do copy/paste
 the contributed packages from the C:\Programme\R221 to the
 C:\Programme\R2.2.1- location in the files system?
 
  or have i to download and install the packages new?
 
 
  greetings from the snowy austria
  merry christmas
  helli
 
  system
  R.2.2.0
  win xp
 

Re: [R] problem with get command

2006-04-26 Thread Gabor Grothendieck
ov$vn1 is not a variable.  It is the result of applying the $ function to
the ov and vn1 arguments.

For example, using BOD which is a data frame that comes with R,
rather than get("BOD$Time") use get("BOD")[["Time"]]
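
A short sketch of the same point with BOD (columns Time and demand), in
case it helps; the variable names ok, failed, cols and summaries are only
for illustration:

```r
# get() looks up one object by name; "BOD$Time" is not the name of an object
ok     <- get("BOD")[["Time"]]
failed <- tryCatch(get("BOD$Time"), error = function(e) conditionMessage(e))

# Looping over several columns by name, which is what the question was after
cols      <- c("Time", "demand")
summaries <- lapply(cols, function(nm) summary(get("BOD")[[nm]]))
names(summaries) <- cols
```

Since the data frame name is fixed here, BOD[[nm]] would do just as well;
get() is only needed when the data frame name itself is held in a string.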

On 4/26/06, Thomas Davidoff [EMAIL PROTECTED] wrote:
 I don't understand what my error is in the following:
 I need to use the get command on a series of variables, but can't for
 some reason that I don't understand.  Why am I told no such variable as
 ov$vn1 after getting a summary report on that very variable?
 > summary(ov$vn1)
     Min.  1st Qu.   Median     Mean  3rd Qu.     Max.     NA's
      1.0     25.0     81.0    468.1    450.0 159100.0   6050.0
 > dvars <- paste("ov$dvn", 1:4, sep="")
 > vars <- c("ov$vn1","ov$vn2","ov$vn3","ov$vn4")
 > summary(get(vars[1]))
 Error in get(x, envir, mode, inherits) : variable "ov$vn1" was not found
 Execution halted



 Thomas Davidoff
 Assistant Professor
 Haas School of Business
 UC Berkeley
 Berkeley, CA 94618
 Phone:(510) 643-1425
 Fax:(510) 643-7357
 email:[EMAIL PROTECTED]
 web:http://faculty.haas.berkeley.edu/davidoff/



Re: [R] stl function

2006-04-26 Thread Gabor Grothendieck
stl does this internally:

x <- na.action(as.ts(x))

so

stl(x, s.window, na.action = f)

is the same as

stl(f(as.ts(x)), s.window)

e.g.

nottem[25] <- NA  # nottem is a built in data set in R
stl(nottem, "per") # error
stl(nottem, "per", na.action = na.contiguous)
library(zoo)
stl(nottem, "per", na.action = na.locf)
stl(nottem, "per", na.action = na.approx)

Whether any of these makes sense is another matter.
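
A quick check of that equivalence using base R only (na.contiguous, so zoo
is not needed); the names x, failed, fit1, fit2 and same are just for this
sketch:

```r
x <- nottem        # built-in monthly temperature series
x[25] <- NA        # introduce one missing value

# The default na.action is na.fail, so this call stops with an error
failed <- inherits(try(stl(x, "per"), silent = TRUE), "try-error")

# Supplying na.action ...
fit1 <- stl(x, "per", na.action = na.contiguous)
# ... should match applying the same function to the series beforehand
fit2 <- stl(na.contiguous(as.ts(x)), "per")

same <- isTRUE(all.equal(fit1$time.series, fit2$time.series))
```

Note that na.contiguous keeps only the longest run without NAs, so the fit
is on a shortened series.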

On 4/26/06, Andrea Toreti [EMAIL PROTECTED] wrote:
 Hi,
 I have a monthly time series with missing values and I would use stl function 
 to identify seasonality.
 I tried all settings of na.action but the result is the same:

  stl(tm245,s.window=11, na.action=na.pass)
 Error in stl(tm245, s.window = 11, na.action = na.pass) :
NA/NaN/Inf in foreign function call (arg 1)

 Can you help me?

 Thanks

 Andrea Toreti


Re: [R] gmane?

2006-04-27 Thread Gabor Grothendieck
They can be found here:

http://dir.gmane.org/gmane.comp.lang.r.announce
Announcements about the development of the R Project for Statistical
Computing and the availability of new code. (read-only)

http://dir.gmane.org/gmane.comp.lang.r.deal
Learning Bayesian networks in R - the 'deal' package

http://dir.gmane.org/gmane.comp.lang.r.debian
Discussion of the Debian port of the statistical software GNU R

http://dir.gmane.org/gmane.comp.lang.r.devel
R language developers list

http://dir.gmane.org/gmane.comp.lang.r.general
The `main' R mailing list, a language and environment for statistical
computing and graphics.

http://dir.gmane.org/gmane.comp.lang.r.geo
Discussion of geographical data in the statistical software GNU R

http://dir.gmane.org/gmane.comp.lang.r.gr
R Special Interest Group on gRaphical models

http://dir.gmane.org/gmane.comp.lang.r.gui
Discussion of the Graphical User Interface for the statistical software GNU R

http://dir.gmane.org/gmane.comp.lang.r.mac
R Special Interest Group on Macintosh Development and Porting, both
for MacOS 8.6 - 9.x and MacOS X

http://dir.gmane.org/gmane.comp.lang.r.r-metrics
Mailing list for discussions relating to use of GNU R in 'finance',
i.e. financial engineering, financial economics, empirical finance,
computational finance, ...




On 4/27/06, Jose Quesada [EMAIL PROTECTED] wrote:
 Hi All,

 I recently found gmane

 http://gmane.org/

 It's a system to convert mail to news and back, with the nice property of
 keeping
 a searchable archive... Very convenient if you are subscribed to many lists
 and
 don't want to have your mail box cluttered. I use it to read several mailing
 lists already, but R is not available there.

 I wonder if the admins know about gmane and if they think it'd be a good
 idea to
 have R-help added there. Quoting from their site:

 To get a new mailing list added, use the subscription form. Almost any
 mailing
 list can be added. Just include subscription information. Mailing list
 archives
 can be imported into Gmane.

 What do you think?
 --
 Cheers,
 -Jose
 --
 Jose Quesada, PhD.

 [EMAIL PROTECTED] Dept. of Psychology
 http://www.andrew.cmu.edu/~jquesada Sussex University
 Brighton, UK



Re: [R] gmane?

2006-04-27 Thread Gabor Grothendieck
Assuming you are aware that the dir pages at gmane provide
the key links for each group, googling for:

   dir gmane r-help

gets it as the first hit.

On 4/27/06, Jose Quesada [EMAIL PROTECTED] wrote:
 Thanks all,

 It was very surprising that I couldn't find it. I searched
 news.gmane.org for r-help, and nothing popped up.

 -Jose


 On 4/27/06, Sundar Dorai-Raj [EMAIL PROTECTED] wrote:
 
 
  Jose Quesada wrote:
   Hi All,
  
   I recently found gmane
  
   http://gmane.org/
  
   It's a system to convert mail to news and back, with the nice property of
   keeping
   a searchable archive... Very convenient if you are subscribed to many
  lists
   and
   don't want to have your mail box cluttered. I use it to read several
  mailing
   lists already, but R is not available there.
  
   I wonder if the admins know about gmane and if they think it'd be a good
   idea to
   have R-help added there. Quoting from their site:
  
   To get a new mailing list added, use the subscription form. Almost any
   mailing
   list can be added. Just include subscription information. Mailing list
   archives
   can be imported into Gmane.
  
   What do you think?
   --
   Cheers,
   -Jose
   --
   Jose Quesada, PhD.
  
   [EMAIL PROTECTED] Dept. of Psychology
   http://www.andrew.cmu.edu/~jquesada Sussex University
   Brighton, UK
  
 
  R-help is already there.
 
  Go to http://www.r-project.org/ and click the Search link. There's a
  link to Gmane there. The relevant group for R-help is called
  gmane.comp.lang.r.general.
 
  I agree that Gmane is very useful.
 
  --sundar
 



 --
 Cheers,
 -Jose
 --
 Jose Quesada, PhD.

 [EMAIL PROTECTED] Dept. of Psychology
 http://www.andrew.cmu.edu/~jquesada Sussex University
 Brighton, UK



Re: [R] State space AR models in R: some examples

2006-04-27 Thread Gabor Grothendieck
Check out the sspir package and
http://www.jstatsoft.org/index.php?vol=16


On 4/27/06, Pablo Almaraz [EMAIL PROTECTED] wrote:
 Hi all,

 Does anyone have an example of an autoregressive (AR) time-series model
 specified as a state space model in R? That is, I want to go beyond the
 locally linear (constant) model, and fit the following Gaussian AR state
 process model:

 Xt = a + (1+b)*Xt-1 + epsilon

 ,where the model for the observation process is

 Yt = Xt + tau

 I have information of the tau's (observation variance) for each
 observation in the time-series, and it would be perfect to include this
 information during the fitting routine. I have actually coded this as a
 WinBUGS code (pasted below), but I'm not quite sure it works as it
 should. I would be extremely thanked if anyone could submit an example
 of an R code fitting the above problem. Gibb's sampler for solving the
 problem would be great, I'm not sure whether the Kalman filter would
 work well with only 30 data points (?). Additional details, corrections
 and/or help would probably save my life at least for a while.

 Thank you all
 Cheers

 Pablo

 WinBUGS state-space R code:
 ##
 model;
{

 # Parameters and priors
alpha ~ dnorm(0,0.01)   # Intrinsic rate of increase
b ~ dnorm(0,0.01)
beta[1] <- b-1   # First-order density-dependence
sigma ~ dunif(0, 1000)   # State process SD
isigma2 <- pow(sigma, -2)   # State process 1/var
# isigma2 ~ dgamma(0.01,0.01)

 # Initial state value
n.exp[1] ~ dnorm(n[1],tau[1])

 # State process model
for(j in 1:(N-1)){
n.exp.mu[j+1] <- alpha + b*n.exp[j]  # First-order Gompertz model
n.exp[j+1] ~ dnorm(n.exp.mu[j+1], isigma2)
}

 # Observation process model
for(j in 1:(N-1)){
n[j+1] ~ dnorm(n.exp[j+1],tau[j+1])
}
}

 # Loge-transformed and standardized time-series data

 list(N=28,
 n=c(-0.24645, 0.015312, 0.442262, -0.05879, -0.17308, -0.03778,
 0.120961, -0.04383, 0.002507, 0.073278, -0.11684, 0.003657, -0.07375,
 0.05006, -0.04489, -0.00826, -0.06713, 0.682228, 0.032058, -0.33254,
 -0.50432, 0.176914, 0.249793, 0.01672, -0.30581, -0.19617, 0.158579,
 0.185296),
 tau=c(2.38351, 2.351379, 49.12811, 10.01703, 11.68982, 3.846619,
 1.999254, 1.6685, 3.011932, 5.661051, 168.2524, 1.581, 25.74985,
 50.29332, 3.03117, 7.65013, 3.376606, 17.34871, 4.215985, 2.455294,
 7.685724, 1.918054, 5.588953, 8.503541, 0.5666, 0.923611, 4.986243,
 10.36613))

 # Inits for MC 1
 list(alpha = 0.5, b = -1, sigma = 0.5)

 # Inits for MC 2
 list(alpha = 1, b = 0.01, sigma = 1)

 # Inits for MC 3
 list(alpha = 0.01, b = 1, sigma = 0.01)

 ### End (not run)

 --

 Pablo Almaraz García
 Estación Biológica de Doñana (CSIC)
 Pabellón del Perú, Avda. Mª Luísa s/n
 E-41013, Sevilla
 SPAIN

 E-mail: almaraz[AT]ebd[DOT]csic[DOT]es
 webpage: http://www.almaraz.org



Re: [R] scope of variable/object ?

2006-04-27 Thread Gabor Grothendieck
You probably need to contact the developer of pamr but
short of investigating it, a workaround might be to put a copy
of myd2 into the global environment since it likely will at
least look there, e.g. add this line to:

   assign("myd2", myd2, .GlobalEnv)

domat.
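
For what it is worth, here is a sketch of one mechanism that would produce
this kind of error. Whether pamr actually works this way is an assumption;
fit(), refit() and wrapper() are hypothetical stand-ins, not pamr
functions. It is meant to be run at top level in a fresh session.

```r
# Hypothetical stand-in for a fitting function that records its own call
fit <- function(data) list(call = match.call(), n = length(data$y))

# Hypothetical stand-in for a helper that re-runs the recorded call,
# looking up the argument names in the global environment only
refit <- function(obj) eval(obj$call, envir = globalenv())

wrapper <- function(d, workaround = FALSE) {
  myd2 <- d                                   # local to wrapper()
  if (workaround) assign("myd2", myd2, envir = globalenv())
  refit(fit(myd2))                            # recorded call is fit(data = myd2)
}

dat    <- list(x = matrix(0, 2, 3), y = 1:3)
failed <- tryCatch(wrapper(dat), error = function(e) conditionMessage(e))
ok     <- wrapper(dat, workaround = TRUE)
rm("myd2", envir = globalenv())               # tidy up the global copy
```

Without the workaround the re-evaluated call cannot see the local myd2;
with the copy in the global environment it can.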

On 4/27/06, Tim Smith [EMAIL PROTECTED] wrote:
 Hi,

  I must be missing something here...Essentially, a short piece of code works 
 if it's standalone, but doesn't work if it's divided into two functions.

  The code that works is:

   ### WORKS ###

 library(pamr)

 set.seed(120)
 x <- matrix(rnorm(1000*20),ncol=20)
 y <- sample(c(1:4),size=20,replace=TRUE)
 mydata <- list(x=x,y=y)
 mytrain <- pamr.train(mydata)
 new.scales <- pamr.adaptthresh(mytrain,ntries = 1)

  

  But if I split the lines into two functions, then I get an error message 
 that reads :
  'Error in pamr.train(data = myd2, threshold = threshold, threshold.scale = 
 all.scales[i+ : object myd2 not found.'

  The code that doesn't work is:


 ### DOESN'T WORK 
 library(pamr)

 domat <- function(myd){
   myd2 <- myd
   mytrain <- pamr.train(myd2)
   new.scales <- pamr.adaptthresh(mytrain)
 }
 dom <- function(){
   set.seed(120)
   x <- matrix(rnorm(1000*20),ncol=20)
   y <- sample(c(1:4),size=20,replace=TRUE)
   myda <- list(x=x,y=y)
   domat(myda)
 }
 dom()

  #

  Did I do something really goofy? How can I find out what's happening?

  many thanks.






Re: [R] Break into Parts

2006-04-28 Thread Gabor Grothendieck
Try this:

tapply(x, cut(x, 12), sd)


On 4/28/06, sumanta basak [EMAIL PROTECTED] wrote:
 Hi R-Experts,

 I have a vector of length 72. I want to break it into 12 parts and
 take the standard deviation of each group. Please help me in this regard.

 Thanks,
 Sumanta.




Re: [R] Break into Parts

2006-04-28 Thread Gabor Grothendieck
Good point.

Following Andy's comment sd(matrix(sort(x), nc=12))
could also be used if you want them broken up
by 6 smallest, next 6 smallest, etc. although
there might be differences in the case of ties.

Using tapply here are a number of ways of breaking
it up (the first two give the same answer as
sd(matrix(x,nc=12)) and the third the same as
sd(matrix(sort(x),nc=12))) while the others form the
groups in different ways:

tapply(x, gl(12, 6), sd)
tapply(x, rep(1:12, each = 6), sd)
tapply(sort(x), gl(12, 6), sd)

tapply(x, rep(1:12, 6), sd)
tapply(sort(x), rep(1:6, 12), sd)
tapply(sort(x), rep(1:6, each = 12), sd)
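
A small sketch comparing the groupings on simulated data (the seed and x
are made up). apply(m, 2, sd) is used in place of sd(matrix(...)) because,
as far as I know, current versions of R no longer apply sd() column by
column to a matrix.

```r
set.seed(1)                 # arbitrary seed, illustration only
x <- rnorm(72)

# Six consecutive elements per group, in the order they appear
by_position  <- tapply(x, gl(12, 6), sd)
by_position2 <- tapply(x, rep(1:12, each = 6), sd)
by_matrix    <- apply(matrix(x, ncol = 12), 2, sd)

# Six smallest, next six smallest, and so on
by_rank        <- tapply(sort(x), gl(12, 6), sd)
by_rank_matrix <- apply(matrix(sort(x), ncol = 12), 2, sd)

# Every 12th element together: still 12 groups of 6, but different members
by_stride <- tapply(x, rep(1:12, 6), sd)
```

All of these return 12 standard deviations; by_position and by_rank differ
only in whether x is sorted first.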


On 4/28/06, Liaw, Andy [EMAIL PROTECTED] wrote:
 You didn't say _how_ you want the vector to be broken up,
 so you get two different answers from Uwe and Gabor.  Uwe's
 answer group every six elements into one group, in the order
 they appear in the vector (which, BTW, can be simplified to
 just sd(matrix(x, ncol=12)).  Gabor's answer put the smallest
 six into one group, the next smallest six into the second
 group and so on.  You'll have to decide which is the one
 you want.

 Andy

 From: sumanta basak
 
  Hi R-Experts,
 
  I have a vector of length 72. I want to break it into 12
  parts and want to take standerd deviation of each group.
  Please help me in this regard.
 
  Thanks,
  Sumanta.
 
 
 
 



Re: [R] plot acf of several timeseries

2006-04-28 Thread Gabor Grothendieck
Try this:

lapply(names(tslist), function(nm) acf(tslist[[nm]], main = nm))
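
A self-contained sketch of the same idea, with a made-up tslist standing in
for the one built with by() in the question:

```r
set.seed(42)                               # arbitrary seed, illustration only
# Made-up series of different lengths; the names mimic the 'item' values
tslist <- list(xxx = ts(rnorm(30), start = 1961),
               yyy = ts(rnorm(40), start = 1959))

op  <- par(mfcol = c(length(tslist), 1))   # one panel per series
res <- lapply(names(tslist), function(nm) acf(tslist[[nm]], main = nm))
par(op)                                    # restore the previous layout
```

Iterating over the names rather than the list elements is what makes the
item name available for the title.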


On 4/28/06, Ulf Mehlig [EMAIL PROTECTED] wrote:
 Hello r-help,

 I have a couple of time-series of different length and I would like to
 produce a simple overview plot showing the autocorrelation functions of
 the series. The time-series are stored in a dataframe like this:

 test.data
  item year value
1  xxx 1961 -1.09
2  xxx 1962  0.21
3  xxx 1963 -0.81
 [trimmed]
8  yyy 1959  1.12
9  yyy 1960  1.44
10 yyy 1961 -1.97
 [trimmed]

 I transformed them to a list of ts-objects and did the plotting via
 lapply():

 tslist <- by(test.data, test.data$item,
              function(x) ts(x$value, start=min(x$year),
                             end=max(x$year)) )
 par(mfcol=c(length(tslist), 1))
 lapply(tslist, acf)

 Is there a possibility to adapt the procedure so that the name of
 item ('xxx', 'yyy', ...) is printed as title of each acf plot? I am
 sure that there are better ways to produce this type of plot ... do you
 have suggestions?

 Many thanks, Ulf

 --
  Ulf Mehlig[EMAIL PROTECTED]



Re: [R] copying previously installed libraries to R 2.3.0

2006-04-28 Thread Gabor Grothendieck
A corrected version is now in batchfiles_0.2-8.zip in:

   http://cran.r-project.org/contrib/extra/batchfiles/

and will propagate to the mirrors shortly.


On 4/28/06, Xiaohua Dai [EMAIL PROTECTED] wrote:
 copydir.bat wont work for libraries such as clim.pact, haplo.stats,
 hier.part, pls.pcr, R.matlab, R.oo. It will truncate new directories as
 clim, haplo, hier, pls, R.



Re: [R] aggregating columns in a data frame in different ways

2006-04-28 Thread Gabor Grothendieck
Here are three possibilities:

1. aggregate on the columns that you want to sum and aggregate on
the columns that you want to average and then merge them:

By <- A[, 2, drop = FALSE]
merge(aggregate(A[, 3, drop = FALSE], By, sum),
      aggregate(A[, 4, drop = FALSE], By, mean))

2. use by:

f <- function(x) with(x, c(count = sum(count), value = mean(value)))
do.call(rbind, by(A[, 3:4], A[, 2, drop = FALSE], f))

3. use summaryBy in the doBy package picking off the appropriate
columns in the output:

library(doBy)
summaryBy(. ~ type, A[, -1], FUN = c(sum, mean))[, c(1, 2, 5)]
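
For reference, the first two approaches run on the six rows shown in the
question (the data frame A below is retyped from that listing; res1 and
res2 are names chosen for this sketch):

```r
A <- data.frame(
  date  = factor(rep(c("2006-04-01", "2006-04-02", "2006-04-03"), each = 2)),
  type  = factor(rep(c("A", "B"), 3)),
  count = c(10, 4, 22, 8, 12, 14),
  value = c(99.6, 33.2, 43.2, 44.9, 12.4, 18.5)
)

# 1. aggregate the summed and the averaged columns separately, then merge
By   <- A[, 2, drop = FALSE]
res1 <- merge(aggregate(A[, 3, drop = FALSE], By, sum),
              aggregate(A[, 4, drop = FALSE], By, mean))

# 2. one function per group via by(), stacked into a matrix
f    <- function(x) with(x, c(count = sum(count), value = mean(value)))
res2 <- do.call(rbind, by(A[, 3:4], A[, 2, drop = FALSE], f))
```

res1 is a data frame with a type column, while res2 is a numeric matrix
with the types as row names.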


On 4/28/06, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:
 I would like to use aggregate() to combine statistics
 for several days in a data frame. My data frame looks
 similar to this:

          date type count value
 1  2006-04-01    A    10  99.6
 2  2006-04-01    B     4  33.2
 3  2006-04-02    A    22  43.2
 4  2006-04-02    B     8  44.9
 5  2006-04-03    A    12  12.4
 6  2006-04-03    B    14  18.5

 ('date' is a factor, and my actual data frame has
 about 100 different 'types', not just two)

 I would like to sum up the 'counts' per 'type', and
 get an average of the 'values' per 'type'. In other
 words, I would like my results to look like this:

   type  count  value
 1  A 44 51.7
 2  B 26 32.2

 The way I'm doing this now is to tear the table apart
 into its individual columns, then apply aggregate() to
 each column individually (using the 'type' column for
 the 'by' parameter), and finally putting everything
 back together, like this:

  A.count = aggregate(A$count, list(type=A$type), sum)
  A.value = aggregate(A$value, list(type=A$type),
 mean)
  B = data.frame(type=A.count$type, count=A.count$x,
 value=A.value$x)

 My actual table is a bit more involved than in this
 simple example, however, so this becomes quite
 tedious.

 I am hoping that there is a simpler way for doing
 this, for example by providing different FUN
 parameters for each column to the aggregate()
 function.

 I would appreciate any suggestions.
 Thanks
 Klaus



Re: [R] How to rep a matrix by row?

2006-04-29 Thread Gabor Grothendieck
Try this where DF is your data frame:

DF[rep(seq(nrow(DF)), each = 3), ]
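A small sketch of the row-repetition idiom, assuming a three-row data frame whose column names are taken from the question (the values are only illustrative):

```r
# Illustrative data frame; column names follow the question.
DF <- data.frame(Source = "control", Treat = c(0, 10, 30),
                 Drug = "A", Replicate = 1:3)

# rep(..., each = 3) yields the row indices 1 1 1 2 2 2 3 3 3.
DF3 <- DF[rep(seq_len(nrow(DF)), each = 3), ]

# Indexing with repeated rows gives row names like 1, 1.1, 1.2; reset them.
rownames(DF3) <- NULL
DF3
```

Using `times = 3` instead of `each = 3` would stack three copies of the whole data frame rather than repeating each row in place.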

On 4/29/06, Jiantao Shi [EMAIL PROTECTED] wrote:
 Hi,
 I have a data frame like this:

  Source   Treat  Drug  Replicate
  control      0     A          1
  control     10     A          2
  control     30     A          3
              10     A          1


 I want to repeat each row of this data frame 3 times, so that the
 result looks as follows:

  Source   Treat  Drug  Replicate
  control      0     A          1
  control      0     A          1
  control      0     A          1
  control     10     A          2
  control     10     A          2
  control     10     A          2
  control     30     A          3
  control     30     A          3
  control     30     A          3
              10     A          1
              10     A          1
              10     A          1


 Is there an easy way to do this?
 thanks.

 Jianao Shi



Re: [R] Reshaping genetic data from long to wide

2006-04-29 Thread Gabor Grothendieck
You might want to double check the list archives

https://www.stat.math.ethz.ch/pipermail/r-help/

to see whether your posts got through, just in case it's only
a problem with displaying your own posts.

On 4/29/06, Farrel Buchinsky [EMAIL PROTECTED] wrote:
 Gabor Grothendieck ggrothendieck at gmail.com writes:

  http://news.gmane.org/gmane.comp.lang.r.general
  or one of these:
  http://dir.gmane.org/gmane.comp.lang.r.general
 

 Yes but when I hit Post this article it send something to gMane (I think)
 but not to R-help@stat.math.ethz.ch



Re: [R] Package docs for CRAN

2006-04-30 Thread Gabor Grothendieck
If your package is called mypkg you could create a mypkg-package.Rd
file, e.g. for the dyn package:

library(dyn)
library(help = dyn) # note that dyn-package is listed
package?dyn
?"dyn-package"   # same

and you could add one or more vignettes, e.g.

library(zoo)
library(help = zoo) # note that the 2 vignettes are listed at end
vignette("zoo")


On 4/30/06, William Asquith [EMAIL PROTECTED] wrote:
 CRAN et al.,

 I would like to add an extended introduction or other arbitrary
 sections to my package lmomco.
 I have been shipping inst/doc/Introduction.Rd. I would like to have
 this content inserted to the front of the PDF build for the CRAN. The
 R-exts.pdf seems to be a little silent on this subject? For my
 purposes, I have been doing this

 R CMD Rd2dvi --pdf --title=lmomco---version X inst/doc/
 Introduction.Rd man/*.Rd

 but I don't get the correct header (description) or the index built
 as seen in the lmomco.pdf from the CRAN.

 Further, is there any point in shipping a complete PDF build of the
 docs as in inst/doc/lmomco.pdf?

 Please advise on best practices for building the best docs that I
 can. . .

 William



Re: [R] Duration labels on plot axes

2006-04-30 Thread Gabor Grothendieck
The times class in chron will give hours and minutes. e.g.

library(times)
plot(times(0:23/23), 0:23)

and you could modify chron:::axis.times for the others.

On 4/30/06, Duncan Murdoch [EMAIL PROTECTED] wrote:
 I have a variable containing a measurement of a duration in seconds
 which I would like to use in a plot, with the axes labelled in a format
 like %H:%M:%S (or possibly %Hh%Mm) if the duration is more than an hour,
 or just %M:%S if more than a minute, or just decimal seconds if short
 enough.  Is there a class with an axis.* method defined that has
 behaviour something like this?

 Duncan Murdoch



Re: [R] general help on R and factor in R and a few simple comment from a newbie

2006-04-30 Thread Gabor Grothendieck
On 4/30/06, Guojun Zhu [EMAIL PROTECTED] wrote:
 Hi.  I am starting to learn R for a course project.  I
 am  relative OK c++ programer.  I found the R is very
 different.  I have read the an introduction to R.  I
 have to say it is not very newbie friendly.  It does
 not explain many things clearly.  And unfortunately,
 there is not too much introductory materials available
 on-line.  I do not want to buy a book.

Enter
  R
into google and you get the R home page.  On the left pane of that
under Documentation click on Other and from there click on
Contributed Documentation and there is a list of literally dozens
of different introductions.

Also google for
  zoonekynd R
for another online intro to R.


 For example, I found factor is a quite different
 concept.I cannot use it as a vector which I can
 somehow think as a 1-dimension array.  help(factor)
 does not help much to clear about the concept either.
 Also there are quite few basic concepts like the data
 structure of model, etc is far from clear for me.  Yet
 there is no general place I can look for there more
 general idea.

 help is a very interesting and useful function.
 However, I would say the content lacks some general
 idea.  I used to learn Mathematica, which is also a
 high-level tool by their help.  It is very
 comprehensive, yet well-organized with some general
 idea, some specific function explanation and some
 functions about one topic.  For R's help, you get only
 the specific explanation for the particular function,
 and no more related things.  I feel it is more like a
 reference for experienced user instead of some newbie.



 I know there should be some trick by R with some dense
 code for big work.  But unfortunately, I could not
 find many place to learn it.


 Now for a specific question,

 I use read.csv to read some data from an excel data
 file (about 30,000 line data).  Some columns has empty
 data, so NA was read.  But they were read in as a
 factor instead of vector.  I need to manipulate them
 later as a vector (for example standardizing by
 dividing with standard deviation, or derive a new
 column from other two or more columns).  How to
 convert it into vector? Or maybe some functions
 already exists for factor already?


Check out the na.strings= and possibly the as.is = TRUE
arguments on read.table.  Also the read.xls command
from the gdata package may be helpful.
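A hedged sketch of what those two arguments do, reading from an in-memory string instead of a file (the column names and values are made up for illustration). The last two lines show the usual way to turn a factor that has already been read back into numbers, which is an addition of mine rather than part of the reply above:

```r
# Made-up CSV content with empty cells in the numeric column 'x'.
csv <- "id,x,label\n1,2.5,a\n2,,b\n3,4.0,c\n"

# na.strings lists the cell contents to treat as missing;
# as.is = TRUE keeps character columns as character instead of factor.
dat <- read.csv(text = csv, na.strings = c("", "NA"), as.is = TRUE)
str(dat)

# Standardize the numeric column, ignoring the missing value.
dat$x.std <- dat$x / sd(dat$x, na.rm = TRUE)

# If a column has already arrived as a factor, go through character first;
# as.numeric() directly on a factor would return the internal level codes.
f <- factor(c("1.5", "2.5", NA))
as.numeric(as.character(f))
```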



Re: [R] Duration labels on plot axes

2006-04-30 Thread Gabor Grothendieck
That should have been:

library(chron)
plot(times(0:23/24), 0:23)



Re: [R] Yet another help needed

2006-04-30 Thread Gabor Grothendieck
Look at
?filter
?embed
rollmean in the zoo package
running in the gtools package
runmean in the caTools package

The last one is probably the fastest.
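For the first part of the question, a base R sketch using `filter` might look like this. The data are simulated and the names `a`, `k` and `prev_mean` are only for illustration; `sides = 1` makes the window end at the current row, and the extra shift by one makes it end at the previous row, as the question asks:

```r
set.seed(1)
a <- rnorm(200)   # stand-in for the column 'a'
k <- 60           # window length

# Trailing mean of the k values ending at the current row.
m <- stats::filter(a, rep(1 / k, k), sides = 1)

# Shift by one so that row i holds the mean of rows (i - k):(i - 1).
prev_mean <- c(NA, head(as.numeric(m), -1))
```

The first 60 entries are NA because a full window of 60 earlier values is not yet available.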

On 4/30/06, Guojun Zhu [EMAIL PROTECTED] wrote:
 I have a big data.frame with about 20 columns and 60,000
 rows to analyze.  Let us say I have a column a.  I
 want to generate a new column whose value should be
 the average of the 60 values of a before the current row.
 Let us say every row is time t_i.  I need to calculate
 a(t_(i-60))+a(t_(i-59))+...+a(t_(i-1)).  Also I want to
 get another number by running a regression of
 a(t_(i-60)),...,a(t_(i-1)) on b(t_(i-60)),...,b(t_(i-1)).
  Is there any simple, dense code for this?  Thanks.



Re: [R] table of means/medians across bins used for a histogram

2006-04-30 Thread Gabor Grothendieck
My understanding is that you want to replace each rate with its average
over the associated bin and then plot age against that.  In that
case try this:

> DF  # test data
    age rate bin
1 0.002 10.0   A
2 0.045  0.1   B
3 0.130 15.0   A
4 0.150 34.0   D
> with(DF, plot(ave(rate, bin), age))

If instead the data are stored in separate vectors age, clock and bin
rather than in the columns of a data frame, we would have

plot(ave(clock, bin), age)
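To see what `ave` returns before plotting, here is a small sketch using the four test rows above (the column name `rate.avg` is my own):

```r
DF <- data.frame(age  = c(0.002, 0.045, 0.13, 0.15),
                 rate = c(10, 0.1, 15, 34),
                 bin  = c("A", "B", "A", "D"))

# ave() returns a vector as long as 'rate', with each element replaced
# by the mean of its own group.
DF$rate.avg <- with(DF, ave(rate, bin))
DF
```

Rows 1 and 3 share bin A, so both get the group mean 12.5, while the single-row bins B and D keep their own rates.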

On 4/30/06, lalitha viswanath [EMAIL PROTECTED] wrote:
 Hi
 I am trying to get a table of means of parameter 1
 across BINS of parameter 2.

 I am working in proteomics and a sample of my data is
 as follows

 cluster-age  clock-rate (evolutionary rate)  scopclass
 0.002        10                              A
 0.045        0.1                             B
 0.13         15                              A
 0.15         34                              D
 
 
 
 

 Scop class has only 9 distinct categories (A-I)
 Whereas cluster-age and clock-rate are discrete
 variables greater than 0.

 I am trying to do two things with this kind of data,
 out of which I managed to accomplish one thanks to the
 documentation and pre-existing queries on the mailing
 lists.
 1. Plot a histogram of the age distribution with scop
 class category superimposed on each bin. I managed to
 do this with barplot2.
 2. Now I am trying to plot a scatter plot of the age
 v/s the clock-rate. However to eliminate possible
 sampling errors, we are trying to get an average of
 the clock-rate for each of the bins used above.
 i.e. before plotting a x-y plot, i wish to compute
 average clock-rate in each of the bins for the age and
 then plot a x-y plot of the age v/s clock rate.

 Can anyone point me to appropriate functions for the
 same?
 I am trying to work with prop.table, cut, break, etc.
 But I am not heading anywhere.

 Thanks
 Lalitha



Re: [R] table of means/medians across bins used for a histogram

2006-04-30 Thread Gabor Grothendieck
Or perhaps a bit simpler:

plot(age ~ ave(rate, bin), DF)




Re: [R] pulling items out of a lm() call

2006-05-01 Thread Gabor Grothendieck
Try this:

# test data
fo <- y ~ female + I(age^2) + female:black + (age + education) * female

# create a list of form list(y = as.name("z.y"), ...) for use with substitute
L <- sapply(all.vars(fo), function(nm) as.name(paste("z", nm, sep = ".")))
do.call(substitute, list(fo, L))
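A short sketch of the two steps, showing what `all.vars` extracts and checking the renamed formula (the names `vars` and `res` are only illustrative, and `simplify = FALSE` is added here to make the list result explicit):

```r
fo <- y ~ female + I(age^2) + female:black + (age + education) * female

# all.vars() walks the formula and returns each variable once, in order of
# first appearance, ignoring operators, parentheses and I().
vars <- all.vars(fo)
vars

# Named list mapping each variable to the symbol z.<variable>.
L <- sapply(vars, function(nm) as.name(paste("z", nm, sep = ".")),
            simplify = FALSE)

# substitute() replaces every symbol that has an entry in L.
res <- do.call(substitute, list(fo, L))
all.vars(res)
```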

On 5/1/06, Andrew Gelman [EMAIL PROTECTED] wrote:
 I want to write a function to standardize regression predictors, which
 will require me to do some character-string manipulation to parse the
 variables in a call to lm() or glm().

 For example, consider the call
 lm (y ~ female + I(age^2) + female:black + (age + education)*female).

 I want to be able to parse this to pick out the input variables
 (female, age, black, education).  Then I can transform these as
 appropriate (to get z.female, z.age, etc), feed them back into the
 lm() function, and go from there.

 Does anyone know an easy way to pull out the variables?  I basically
 have to parse out the symbols "+", ":", "*", and " ", but there's also
 the problem of handling parentheses and the I() operator.

 Thanks!
 Andrew

 --
 Andrew Gelman
 Professor, Department of Statistics
 Professor, Department of Political Science
 [EMAIL PROTECTED]
 www.stat.columbia.edu/~gelman

 Statistics department office:
  Social Work Bldg (Amsterdam Ave at 122 St), Room 1016
  212-851-2142
 Political Science department office:
  International Affairs Bldg (Amsterdam Ave at 118 St), Room 731
  212-854-7075

 Mailing address:
  1255 Amsterdam Ave, Room 1016
  Columbia University
  New York, NY 10027-5904
  212-851-2142
  (fax) 212-851-2164



Re: [R] How to strip one term from a data.frame? + How to write long line in script?

2006-05-01 Thread Gabor Grothendieck
Using the built in data frame iris, which has 5 columns, regress
Sepal.Length against all other variables except the last one:

lm(Sepal.Length ~., iris[1:4])
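When the unwanted columns are easier to identify by name than by position, a sketch along the following lines may help. The choice of `Species` as the column to drop is only for illustration, and `reformulate` is shown as one way to build the formula from a vector of names so that no long line has to be typed:

```r
# Drop columns by name, then let '.' stand for everything that remains.
keep <- setdiff(names(iris), "Species")
fit1 <- lm(Sepal.Length ~ ., data = iris[keep])

# Alternatively, build the formula itself from the column names.
fo <- reformulate(setdiff(keep, "Sepal.Length"), response = "Sepal.Length")
fit2 <- lm(fo, data = iris)

coef(fit1)
```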

On 5/1/06, Guojun Zhu [EMAIL PROTECTED] wrote:
 I need to run a regression with 14 normal variables
 and 20 dummy variables.  All the data is in a huge
 data.frame  df.  But there is some extra intermediate
 item in the same data.frame too.  It will be nice I
 can strip off those terms and run lm().  Also, is
 there a simple way to write the formula, for example,
 just specify the y term, all other term in data.frame
 should be x_i.  Or is there some kind of automatically
 way to build it?  like use ls(df) couple with some
 other functions?


 If I have to write it in the brute force way, I will
 need to write a real long line in script.  I am in
 windows.  I found the script does not work with a long
 line.  It will not work either if I break it into a
 few lines.  How to get rid of that? thanks.



Re: [R] efficiency in merging two data frames

2006-05-01 Thread Gabor Grothendieck
Some functions that may be of help:

?aggregate.ts
?cbind
?merge

and in the zoo package

?as.yearmon
?as.yearqtr
?aggregate.zoo
?merge.zoo
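As a rough sketch of how the row-by-row loop in the question could be replaced with vectorized lookups, consider the toy data below. The data frames, values and the `assets` column are hypothetical, and for simplicity the quarter is the calendar quarter, not the fiscal quarter the question computes:

```r
# Toy stand-ins for the three data sets in the question.
ret <- data.frame(PERMNO = c(1, 1, 2), year = 2005,
                  month = c(1, 4, 12), RET = c(0.01, -0.02, 0.03))
id <- data.frame(PERMNO = c(1, 2), GVKEY = c(101, 202))
account <- data.frame(GVKEY = c(101, 101, 202), year = 2005,
                      qtr = c(1, 2, 4), assets = c(5, 6, 7))

# match() looks up every PERMNO at once instead of looping over rows.
ret$GVKEY <- id$GVKEY[match(ret$PERMNO, id$PERMNO)]

# Integer division turns months 1..12 into quarters 1..4 for the whole column.
ret$qtr <- (ret$month - 1) %/% 3 + 1

# Join on the shared key columns.
both <- merge(ret, account, by = c("GVKEY", "year", "qtr"))
both
```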

On 5/1/06, Guojun Zhu [EMAIL PROTECTED] wrote:
 I have two data sets with stock and fiscal data for lots of
 companies.  One is monthly data with about 144,000 lines, and
 the other is quarterly with about 56,000.  Each data set uses
 a different company code, and I need to merge the two.  I read
 both in as csv, along with another file holding the corresponding
 firm codes, so now I have three data sets: return (with
 return$PERMNO), account (with account$GVKEY) and id, the data
 frame of the correspondence, which has both id$PERMNO and
 id$GVKEY.  I also need to convert the month in return into a
 quarter and finally merge the two data frames (return and
 account).  I ended up writing a short program for this, but it
 runs very slowly: 15+ minutes.  Is there a quick way to do it?
 Here is my original code.



 id$fy=rep(0,length(id$PERMNO))
 for (i in 1:length(id$PERMNO))
   id$fy[[i]]<-account$FYR[id$GVKEY[[i]]==account$GVKEY][[1]]

 return$GVKEY=rep(0,length(return$PERMNO))
 return$fyy=rep(0,length(return$PERMNO))
 return$fyq=rep(0,length(return$PERMNO))
 for (i in 1:length(return$PERMNO)) {
   temp<-id$PERMNO==return$PERMNO[[i]];
   tempmon<-id$fy[temp][[1]];
   if (return$month[[i]]<=tempmon) {
     return$fyy[[i]]<-return$year[[i]];
     return$fyq[[i]]<-4-(tempmon-return$month[[i]])%/%3;
   }
   else{
     return$fyy[[i]]<-return$year[[i]]+1;
     return$fyq[[i]]<-(return$month[[i]]-tempmon-1)%/%3;
   }
   return$GVKEY[[i]]<-id$GVKEY[temp][[1]];
 }

 returnnew=merge(return,account,by.x=c("GVKEY","fyy","fyq"),by.y=c("GVKEY","fyy","fyq"))



Re: [R] How to specify function arguments that are used in different places

2006-05-01 Thread Gabor Grothendieck
You could have a list of args for each one like this:

# test data
x <- list(data = c(1,3,5), points = c(2,4))

myfunc <- function(x, plot.args = NULL, points.args = NULL) {
    do.call(plot, c(list(x$data), plot.args))
    do.call(points, c(list(x$points), points.args))
}
myfunc(x, plot.args = list(col = "red"), points.args = list(col = "blue"))
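One possible refinement, sketched here under my own naming (`myfunc2` and its default lists are hypothetical), is to give each component a list of defaults and merge the caller's list over it with `modifyList`, so only the settings that differ need to be passed:

```r
# Hypothetical wrapper: defaults are merged with whatever the caller supplies.
myfunc2 <- function(x, plot.args = list(), points.args = list()) {
  plot.defaults   <- list(xlab = "index", ylab = "data", col = "red")
  points.defaults <- list(col = "blue", pch = 16)
  # Entries in the caller's list override entries of the same name.
  do.call(plot,   c(list(x$data),   modifyList(plot.defaults, plot.args)))
  do.call(points, c(list(x$points), modifyList(points.defaults, points.args)))
  invisible(NULL)
}

x <- list(data = c(1, 3, 5), points = c(2, 4))
pdf(NULL)   # null device, so running the example writes no file
myfunc2(x, plot.args = list(col = "darkgreen", main = "demo"))
dev.off()
```

A third list for `lines` can be added in exactly the same way.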


On 5/1/06, Gregor Gorjanc [EMAIL PROTECTED] wrote:
 Hello!

 Subject is not very clear, but I hope my question will be;) I wrote a
 function, which produces a plot and I have problems with arguments. For
 the sake of example let us consider that my function looks like this

 myfunc <- function(x, points=FALSE, lines=FALSE, ...)
 {
  ## x is an object that is being plotted
  plot(x$plotData, ...)
  ## one can also add some data on graph via points
  points(x$pointsData, ...)
  ## one can also add some data on graph via lines
  lines(x$linesData, ...)
 }

 My problem is in ... argument. plot(), points() and lines() have so
 many possible arguments, which is very nice, but how can I deal with
 them in my scenario. For example, I might want to specify red color for
 plot, blue for points and green for lines. Is it possible to handle such
 a mixture, without specifiying zillion of arguments such as plotCol,
 pointsCol, linesCol etc.? Perhaps something like ~ points$...?

 Thanks!

 --
 Lep pozdrav / With regards,
Gregor Gorjanc

 --
 University of Ljubljana PhD student
 Biotechnical Faculty
 Zootechnical Department URI: http://www.bfro.uni-lj.si/MR/ggorjan
 Groblje 3   mail: gregor.gorjanc at bfro.uni-lj.si

 SI-1230 Domzale tel: +386 (0)1 72 17 861
 Slovenia, Europefax: +386 (0)1 72 17 888

 --
 One must learn by doing the thing; for though you think you know it,
  you have no certainty until you try. Sophocles ~ 450 B.C.



Re: [R] table of means/medians across bins used for a histogram

2006-05-01 Thread Gabor Grothendieck
I assume you want to discretize one column and then for
each level produced, calculate the mean of another column
and plot those means against the levels.

Using the builtin iris data frame discretize Sepal.Width producing the
SWfac factor and calculate, SLmean, the mean Sepal.Length for each
level of that factor.  Then plot using custom x axis:

SWfac <- cut(iris$Sepal.Width, seq(2, 4.4, .5))
SLmean <- tapply(iris$Sepal.Length, SWfac, mean)

plot(SLmean, xaxt = "n")
axis(1, seq(SLmean), levels(SWfac))
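Applied to the setting in the question (ages between 0 and 0.7, intervals of width 0.02), the same two steps might look as follows. The data are simulated and the names `age`, `rate`, `agefac` and `ratemean` are assumptions for the sketch:

```r
set.seed(1)
age  <- runif(500, 0, 0.7)   # simulated ages
rate <- rexp(500)            # simulated rates

# 35 intervals of width 0.02; include.lowest keeps an age of exactly 0.
agefac   <- cut(age, seq(0, 0.7, by = 0.02), include.lowest = TRUE)
ratemean <- tapply(rate, agefac, mean)

pdf(NULL)   # null device, so running the example writes no file
plot(ratemean, xaxt = "n", xlab = "age interval", ylab = "mean rate")
axis(1, seq(ratemean), levels(agefac), las = 2, cex.axis = 0.6)
dev.off()
```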


On 5/1/06, lalitha viswanath [EMAIL PROTECTED] wrote:
 Hi
 I think I seem to have phrased my doubt incorrectly.
 I want a x-y plot of age v/s rate (the bin is
 irrelevant for this plot); only that instead of a
 simple x-y plot, i want a plot of average(rate) for
 each age-intervals.

 My ages vary from 0 to 0.7 and I want to divide them
 in groups of 0.02.

 So I want a plot of the following
 Age-intervals   Average rate in that interval
 0-0.02          5
 0.02-0.04       7
 0.04-0.06       1
 0.06-0.08       0
 0.08-0.1        0.15

 Age-intervals mentioned along the x-axis (like for a
 histogram) and rates plotted for each age-interval



Re: [R] Adding elements in an array where I have missing data.

2006-05-01 Thread Gabor Grothendieck
Here are a few alternatives:

replace(a, is.na(a), 0) + b

ifelse(is.na(a), 0, a) + b

mapply(sum, a, b, MoreArgs = list(na.rm = TRUE))
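Run on the two vectors from the question, the alternatives can be compared directly. The `rowSums` line is an extra variant of my own, not one of the three above:

```r
a <- c(2, NA, 3)
b <- c(3, 4, 5)

r1 <- replace(a, is.na(a), 0) + b                    # zero out the NAs first
r2 <- ifelse(is.na(a), 0, a) + b                     # same idea via ifelse
r3 <- mapply(sum, a, b, MoreArgs = list(na.rm = TRUE))  # pairwise sum, NA dropped
r4 <- rowSums(cbind(a, b), na.rm = TRUE)             # treat the pair as a matrix

r1
```

Note that the first two only handle missing values in `a`; the last two ignore an NA in either vector.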

On 5/1/06, John Kane [EMAIL PROTECTED] wrote:
 This is a simple question but I cannot seem to find
 the answer.
 I have two vectors but with missing data and I want to
 add them together with
 the NA's being ignored.

 Clearly I need to get the NA ignored.  na.action?

 I have done some searching and cannot get na.action to
 help.
 This must be a common enough issue that the answer is
 staring me in the face
 but I just don't see it.

 Simple example
 a <- c(2, NA, 3)
 b <- c(3, 4, 5)

 What I want is
 c <- a + b where
 c is (5, 4, 8)

 However I get
 c is (5, NA, 8)

 What am I missing?  Or do I somehow need to recode the
 NA's as missing?

 Thanks



Re: [R] Pasting data into scan()

2006-05-01 Thread Gabor Grothendieck
On 5/1/06, Murray Jorgensen [EMAIL PROTECTED] wrote:
 The file TENSILE.DAT from the Hand et al Handbook of Small Data Sets
 looks like this:

 0.023   0.032   0.054   0.069   0.081   0.094
 0.105   0.127   0.148   0.169   0.188   0.216
 0.255   0.277   0.311   0.361   0.376   0.395
 0.432   0.463   0.481   0.519   0.529   0.567
 0.642   0.674   0.752   0.823   0.887   0.926

 except that my mail client has replaced the tab separators by blanks. If
 I paste this data into R 2.2.1 what I get is

   strength - scan()
 1: 0.0230.0320.0540.0690.0810.094
 1: 0.1050.1270.1480.1690.1880.216
 Error in scan() : scan() expected 'a real', got
 '0.0230.0320.0540.0690.0810.094'
   0.2550.2770.3110.3610.3760.395
 Error: syntax error in 0.2550.2770
   0.4320.4630.4810.5190.5290.567
 Error: syntax error in 0.4320.4630
   0.6420.6740.7520.8230.8870.926
 Error: syntax error in 0.6420.6740

 Aha! I thought, what I need is scan(sep = "\t")
 but this generates the same error messages.

1. If your situation is that you have separators
but don't know what they are try this.
It replaces all characters that don't appear
in numbers with a space:

L <- readLines("clipboard")
L <- gsub("[^-0-9.]", " ", L)
scan(textConnection(L))

2. If the separators are completely lost you may still be
able to recover the data if you can assume that every
number is of the form d.ddd where d is a digit.  Just
search for that pattern and replace it with itself
and a space:

L <- readLines("clipboard")
L <- gsub("([0-9][.][0-9][0-9][0-9])", "\\1 ", L)
scan(textConnection(L))

3. Doing a google search for tensile.dat finds a data set
that looks like yours.  Try this:

URL <- "http://statistics.byu.edu/resources/files/datasets/tensile.dat"
scan(URL)
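The second approach can be tried without a clipboard by feeding it two of the run-together lines from the error output above as a character vector (the `text` argument of `scan` is used here in place of `textConnection`):

```r
# Two lines as they arrived after the separators were lost.
L <- c("0.0230.0320.0540.0690.0810.094",
       "0.1050.1270.1480.1690.1880.216")

# Put a space after every d.ddd pattern, then let scan() split on white space.
L2 <- gsub("([0-9][.][0-9][0-9][0-9])", "\\1 ", L)
strength <- scan(text = L2, quiet = TRUE)
strength
```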



Re: [R] Still help needed on embeded regression

2006-05-02 Thread Gabor Grothendieck
Using runmean from caTools the first one below does
it in under 1 second but will not handle NAs.  The
second one takes under 15 seconds and handles
them by replacing them with linear approximations.
Note that k must be odd.

# 1

library(caTools)
set.seed(1)
system.time({
    y <- rnorm(140001)
    x <- as.numeric(seq(y))
    k <- 61
    Mxy <- runmean(x * y, k)
    Mxx <- runmean(x * x, k)
    Mx <- runmean(x, k)
    My <- runmean(y, k)
    b <- (Mxy - Mx * My) / (Mxx - Mx * Mx)
    a <- My - b * Mx
})

# 2

library(caTools)
library(zoo)
set.seed(1)
system.time({
    y <- rnorm(140000)
    x <- as.numeric(seq(y))
    x[100:200] <- NA
    x <- na.approx(zoo(x))
    y <- zoo(y)
    k <- 60
    Mxy <- runmean(x * y, k)
    Mxx <- runmean(x * x, k)
    Mx <- runmean(x, k)
    My <- runmean(y, k)
    b <- (Mxy - Mx * My) / (Mxx - Mx * Mx)
    a <- My - b * Mx
})
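The reason running means are enough is that the least-squares slope over a window equals the window covariance of x and y divided by the window variance of x, and both can be written in terms of means. The sketch below checks this against `lm` on one window; it uses base R `filter` as a stand-in for `runmean`, simulated data, and a helper name `rmean` that is my own:

```r
set.seed(1)
n <- 500
k <- 61                                  # odd, so the window is centred
x <- cumsum(rnorm(n))
y <- 0.5 * x + rnorm(n)

# Centred running mean over k values (NA near both ends).
rmean <- function(v) as.numeric(stats::filter(v, rep(1 / k, k), sides = 2))

Mxy <- rmean(x * y); Mxx <- rmean(x * x)
Mx  <- rmean(x);     My  <- rmean(y)
b <- (Mxy - Mx * My) / (Mxx - Mx * Mx)   # slope in each window
a <- My - b * Mx                         # intercept in each window

# Compare with an explicit regression on the window centred at row 250.
i   <- 250
win <- (i - 30):(i + 30)
fit <- coef(lm(y[win] ~ x[win]))
c(a[i], b[i])
fit
```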


On 5/1/06, Guojun Zhu [EMAIL PROTECTED] wrote:
 I basically has a long data.frame a.  but I only need
 three columns x,y. Let us say the index of row is t.
 I need to produce new column s_t as the linear
 regression coefficient of (x_(t-60),...x_(t-1)) on
 (y_(t-60),...,y_(t-1)). The data is about 140,000
 rows.  I wrote a simple code on this which is super
 slow, it takes more than 2 hours on a 2.8Ghz Intel Duo
 Core.  My friend use SAS and his code needs only
 couple of minutes.  I know there must be some more
 efficient way to write it.  Can anyone help me on
 this?  Here is the code.

 Also one line produce a complete NA temp$y and lm
 function failed on that.  How to make it just produce
 a NA instead and keep runing?

 attach(return)
 betat=rep(NA,length(RET))
 for (i in 61:length(RET)){cat(i, );
 if (year[[i]]=1995){

 temp<-data.frame(y=RET[(i-60):(i-1)]-riskfree[(i-60):(i-1)],x=sprtrn[(i-60):(i-1)]-riskfree[(i-60):(i-1)])

 betat[[i]]<-lm(y~x+1,na.action=na.exclude,temp)[[1]][[2]]
  #if (i%%100==0)
 cat(i, );


 return$vol.cap[[i]]=mean(VOL[(i-12):(i-1)],na.rm=TRUE)/return$cap[[i]]
 }
 }



Re: [R] Still help needed on embeded regression

2006-05-02 Thread Gabor Grothendieck
Try

runmean2 <- function(x, k) # k must be even
    (coredata(runmean(x, k-1)) * (k-1) +
        coredata(lag(x, -k/2, na.pad = TRUE)))/k
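If depending on an odd-width running mean is inconvenient, a different technique that works for any window length, odd or even, is to difference a cumulative sum. This is a plain base R sketch of that alternative, not the `runmean2` approach above, and `winmean` is a hypothetical helper name:

```r
# Trailing mean of the k values ending at each row, for any k >= 1.
winmean <- function(v, k) {
  n   <- length(v)
  cs  <- c(0, cumsum(v))                 # cs[j + 1] = sum of v[1:j]
  out <- rep(NA_real_, n)
  # The sum over rows (i - k + 1):i is cs[i + 1] - cs[i - k + 1].
  out[k:n] <- (cs[(k + 1):(n + 1)] - cs[1:(n - k + 1)]) / k
  out
}

set.seed(1)
v <- rnorm(300)
m60 <- winmean(v, 60)
m60[100]
```

Unlike the zoo-based code, this version does not handle missing values, and for very long series the cumulative sum can lose some precision.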

Also, in your code use matrices or vectors instead of data frames
to avoid any overhead in using data frames.

On 5/2/06, Guojun Zhu [EMAIL PROTECTED] wrote:
 Sorry to bother you guys again. This is great.  But
 this is for 61 number and the second case will change
 60 to 61.  run* only accept odd number window. How
 to get around it with 60?  Any suggestion? Thanks.

 --- Gabor Grothendieck [EMAIL PROTECTED]
 wrote:

  Using runmean from caTools the first one below does
  it in under 1 second but will not handle NAs.  The
  second one takes under 15 seconds and handles
  them by replacing them with linear approximations.
  Note that k must be odd.
 
  # 1

  library(caTools)
  set.seed(1)
  system.time({
    y <- rnorm(140001)
    x <- as.numeric(seq(y))
    k <- 61
    Mxy <- runmean(x * y, k)
    Mxx <- runmean(x * x, k)
    Mx <- runmean(x, k)
    My <- runmean(y, k)
    b <- (Mxy - Mx * My) / (Mxx - Mx * Mx)
    a <- My - b * Mx
  })

  # 2

  library(caTools)
  library(zoo)
  set.seed(1)
  system.time({
    y <- rnorm(14)
    x <- as.numeric(seq(y))
    x[100:200] <- NA
    x <- na.approx(zoo(x))
    y <- zoo(y)
    k <- 60
    Mxy <- runmean(x * y, k)
    Mxx <- runmean(x * x, k)
    Mx <- runmean(x, k)
    My <- runmean(y, k)
    b <- (Mxy - Mx * My) / (Mxx - Mx * Mx)
    a <- My - b * Mx
  })
 
 




Re: [R] evaluation of expressions

2006-05-02 Thread Gabor Grothendieck
Try this:

e <- expression(glm(y ~ age))
eval(e)

or this:

chr <- "glm(y ~ age)"
eval(parse(text = chr))
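
A minimal sketch of the second approach applied to a whole model call held in a string. The data frame `test` and the call in `expr` are invented for the illustration and are not the ones from the question.

```r
set.seed(1)
test <- data.frame(y = rbinom(50, 1, 0.5), sex = rnorm(50), age = rnorm(50))

# the model call as a character string
expr <- 'glm(y ~ sex + I(age^2), data = test, family = binomial(link = "logit"))'

# parse() turns the string into an expression; eval() then runs it
fit <- eval(parse(text = expr))
class(fit)
```

The string is evaluated in the calling environment, so `test` has to be visible there.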

On 5/2/06, Andrew Gelman [EMAIL PROTECTED] wrote:
 Hi, all.  I'm trying to automate some regression operations in R but am
 confused about how to evaluate expressions that are expressed as
 character strings.  For example:

 y <- ifelse (rnorm(10)0, 1, 0)
 sex <- rnorm(10)
 age <- rnorm(10)
 test <- as.data.frame (cbind (y, sex, age))

 # this works fine:
 glm (y ~ sex + I(age^2), data=test, family=binomial(link=logit),
 subset=age1)

 # but now I want to do it in two steps:
 expr <- 'glm (y ~ sex + I(age^2), data=test,
 family=binomial(link=logit), subset=age1)'

 Given expr, defined above, how can I evaluate it?  I played around
 with eval() and as.expression() but can't figure it out.

 Thanks.
 Andrew

 --
 Andrew Gelman
 Professor, Department of Statistics
 Professor, Department of Political Science
 [EMAIL PROTECTED]
 www.stat.columbia.edu/~gelman

 Statistics department office:
  Social Work Bldg (Amsterdam Ave at 122 St), Room 1016
  212-851-2142
 Political Science department office:
  International Affairs Bldg (Amsterdam Ave at 118 St), Room 731
  212-854-7075

 Mailing address:
  1255 Amsterdam Ave, Room 1016
  Columbia University
  New York, NY 10027-5904
  212-851-2142
  (fax) 212-851-2164



Re: [R] Adding elements in an array where I have missing data.

2006-05-02 Thread Gabor Grothendieck
On 5/2/06, Berton Gunter [EMAIL PROTECTED] wrote:
 
  Here are a few alternatives:
 
  replace(a, is.na(a), 0) + b
 
  ifelse(is.na(a), 0, a) + b
 
  mapply(sum, a, b, MoreArgs = list(na.rm = TRUE))
 

 Well, Gabor, if you want to get fancy...

 evalq({a[is.na(a)]<-0;a})+b


Note that the evalq can be omitted:

   { a[is.na(a)] <- 0; a } + b



Re: [R] Adding elements in an array where I have missing data.

2006-05-02 Thread Gabor Grothendieck
But the evalq solution does change a.

> a <- c(2, NA, 3)
> b <- c(3, 4, 5)
> evalq({a[is.na(a)]<-0;a})+b
[1] 5 4 8
> a
[1] 2 0 3

If evalq were changed to local then it would not change a:

> a <- c(2, NA, 3)
> b <- c(3, 4, 5)
> local({a[is.na(a)]<-0;a})+b
[1] 5 4 8
> a
[1]  2 NA  3

Also the replace, ifelse and mapply solutions do not change a.
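
The comparison can be put side by side in one script. This is only a sketch of the behaviour described above; the helper names r1, r2, r3, a_after_evalq and a_after_local are made up.

```r
a <- c(2, NA, 3)
b <- c(3, 4, 5)

# evalq() with default arguments evaluates in the calling frame,
# so the assignment overwrites a
r1 <- evalq({ a[is.na(a)] <- 0; a }) + b
a_after_evalq <- a

a <- c(2, NA, 3)
# local() evaluates in a new environment, so the outer a is untouched
r2 <- local({ a[is.na(a)] <- 0; a }) + b
a_after_local <- a

# replace() returns a modified copy and never assigns
r3 <- replace(a, is.na(a), 0) + b
```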


On 5/2/06, Berton Gunter [EMAIL PROTECTED] wrote:
 Below.

 
  Note that the evalq can be omitted:
 
     { a[is.na(a)] <- 0; a } + b
 

 No it can't. The idea is **not** to change the original a.

 -- Bert





Re: [R] Use predict.lm

2006-05-02 Thread Gabor Grothendieck
Try this:

# regression of Sepal.Length on cols 2 and 4 using first 100 rows
iris.lm <- lm(Sepal.Length ~ ., iris[,c(1,2,4)], subset = 1:100)

# now do it with next 50 rows
predict(update(iris.lm, subset = 101:150))

# double check - this gives same result as last line
predict(lm(Sepal.Length ~ ., iris[,c(1,2,4)], subset = 101:150))


On 5/2/06, Jiang, Jincai (Institutional Securities Management)
[EMAIL PROTECTED] wrote:
 Hi All,

 I created a two variable lm() model

 slm <- lm(y[1:3000,8]~y[1:3000,12]+y[1:3000,15])

 I made two predictions

 predict(slm,newdata=y[201:3200,])
 predict(slm,newdata=y[601:3600,])

 there is no error message for either of these.
 the results are identical, and identical to slm$fitted as well.

 if this is not the right way to apply the model coefficients to a new
 set of inputs, what is the right way?

 Thank you

 Regards,

 Jincai Jiang
 (Office) 212-761-3984

 


Re: [R] Time series plot

2006-05-02 Thread Gabor Grothendieck
Try this (where you can replace textConnection(L) with the name
of the file containing the data):

L <- "01/02/1990 0.531 0.479
01/03/1990 0.510 0.522
01/06/1990 0.602 0.604"

library(zoo)
z <- read.zoo(textConnection(L), format = "%m/%d/%Y")
plot(z, plot.type = "single")

This will give more info on zoo:

library(zoo)
vignette("zoo")
library(help = zoo)




On 5/2/06, Jiang, Jincai (Institutional Securities Management)
[EMAIL PROTECTED] wrote:
 I have some time series data like

 01/02/1990 0.531 0.479
 01/03/1990 0.510 0.522
 01/06/1990 0.602 0.604

 There are no weekends or holidays.
 How do I graph them in a single plot where the x-axis is the dates and
 the y-axis is the time series values?
 Thank you

 Regards,

 Jincai Jiang
 (Office) 212-761-3984

 


Re: [R] Use predict.lm

2006-05-02 Thread Gabor Grothendieck
Sorry, I don't think that my earlier reply was what you wanted.
Try this instead:

# fit using first 100 points
iris.lm <- lm(Sepal.Length ~ ., iris[1:100,c(1,2,4)])

# predict using coefficients from above and variables from next 50 points
predict(iris.lm, iris[101:150, c(1,2,4)])
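
The same pattern on a made-up data frame, as a sketch (d, resp, p1 and p2 are hypothetical names). The point is that predict() looks in newdata for columns with the names used in the formula. A formula written as y[1:3000,8] ~ y[1:3000,12] + y[1:3000,15] contains no such column names, so newdata is most likely ignored and the fitted values come back, which would explain the identical results in the question.

```r
set.seed(1)
d <- data.frame(resp = rnorm(150), p1 = rnorm(150), p2 = rnorm(150))

# name the columns in the formula and pass the rows via data=
fit <- lm(resp ~ p1 + p2, data = d[1:100, ])

# newdata is searched for columns named p1 and p2
p_new <- predict(fit, newdata = d[101:150, ])

# the same thing by hand from the coefficients
by_hand <- drop(cbind(1, as.matrix(d[101:150, c("p1", "p2")])) %*% coef(fit))
all.equal(unname(p_new), unname(by_hand))
```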


 
  


Re: [R] Still help needed on embeded regression

2006-05-02 Thread Gabor Grothendieck
I was assuming that this would be added to my example
where the data is a zoo object so lag.zoo is being used.
Try this:

> library(zoo)
> z <- zoo(11:15)
> z
 1  2  3  4  5
11 12 13 14 15
> lag(z,-1,na.pad=TRUE)
 1  2  3  4  5
NA 11 12 13 14
> ?lag.zoo
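
For comparison, here is a small base R sketch of what happens when lag is applied to a plain vector rather than a zoo object. The variable names are made up. The values are left alone and only the time attribute moves, which matches the test quoted below.

```r
y <- 1:10

# stats::lag() on a plain vector keeps the values and only
# shifts the time attribute (tsp)
ly <- stats::lag(y, -1)
tsp(ly)
as.numeric(ly)

# shifting the values themselves, padding with NA at the start
shifted <- c(NA, head(y, -1))
shifted
```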


On 5/2/06, Guojun Zhu [EMAIL PROTECTED] wrote:
 It does not work though.  How does lag work?  I read
 the help and do not quite understand it.  Here is a
 test:

 > y
  [1]  1  2  3  4  5  6  7  8  9 10
 > coredata(lag(y,-1))
  [1]  1  2  3  4  5  6  7  8  9 10
 attr(,"tsp")
 [1]  2 11  1




Re: [R] Listing Variables

2006-05-03 Thread Gabor Grothendieck
Column names in iris that contain the string Sepal:

grep("Sepal", names(iris), value = TRUE)
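
Applied to the situation in the question, a sketch might look like this. The data frame Data and its values are invented, following the naming pattern described below.

```r
Data <- data.frame(PersonalID = 1:3,
                   Height.1 = c(75, 72, 78),
                   Height.5 = c(110, 105, 112),
                   Height.9 = c(133, 130, 138))

# "^Height\\." anchors the match at the start of the column name
height_vars <- grep("^Height\\.", names(Data), value = TRUE)

# the names can then drive a for loop, indexing with [[ ]]
means <- numeric(0)
for (v in height_vars) means[v] <- mean(Data[[v]])
means
```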


On 5/3/06, Farrel Buchinsky [EMAIL PROTECTED] wrote:
 How does one create a vector whose contents is the list of variables in a
 dataframe pertaining to a particular pattern?
 This is so simple but I cannot find a straightforward answer.
 I want to be able to pass the contents of that list to a for loop.

 So let us assume that one has a dataframe whose name is Data. And let us
 assume one had the height of a group of people measured at various ages.

 It could be made up of vectors Data$PersonalID, Data$FirstName,
 Data$LastName, Data$Height.1, Data$Height.5, Data$Height.9,
 Data$Height.10, Data$Height.12, Data$Height.20 and many, many more variables.

 How would one create a vector of all the Height variable names?

 The simple workaround is not to bother creating the vector Data$Height.1,
 Data$Height.5, Data$Height.9, Data$Height.10,
 Data$Height.12, Data$Height.20, ... but rather just to use the sapply
 function. However, with some functions sapply will not work and it is
 necessary to supply each variable name to a function (see the thread
 "Repeating tdt function on thousands of variables").


 This is such a core capability. I would like to see it in the R-Wiki but
 could not find it there.

 --
 Farrel Buchinsky, MD
 Pediatric Otolaryngologist
 Allegheny General Hospital
 Pittsburgh, PA



Re: [R] Math expressions in pie chart labels?

2006-05-03 Thread Gabor Grothendieck
On 5/3/06, Uwe Ligges [EMAIL PROTECTED] wrote:
 Johannes Graumann wrote:

  On Tuesday 02 May 2006 23:33, Uwe Ligges wrote:
 
 Then please read ?plotmath and use it:
 
 labels = expression( = 0.66,  == 0.33,  = -0.33,  = -0.66)
 
 
  Error in lab != "" : comparison is not allowed for expressions
  In addition: Warning message:
  is.na() applied to non-(list or vector) in: is.na(lab <- labels[i])
 
  I don't seem to be the only one having problems with this ;0)


 Then please tell us the details, I just tried successfully:

 plot(1:10, xaxt="n")
 axis(1, at = c(1,3,5,7), labels =
  expression( = 0.66,  == 0.33,  = -0.33,  = -0.66))


I think the discussion applies to pie:

> pie(c(1,3,5,7), labels =
+  expression( = 0.66,  == 0.33,  = -0.33,  = -0.66))
Error in lab != "" : comparison is not allowed for expressions
In addition: Warning message:
is.na() applied to non-(list or vector) in: is.na(lab <- labels[i])



Re: [R] sprintf question

2006-05-03 Thread Gabor Grothendieck
Try this:

do.call(sprintf, c("%9.2f\t%d\t%d\t%8.3f", as.list(v[iv])))
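
Spelled out step by step, using the v and iv from the question below (fmt, args and s are names introduced for the sketch):

```r
v <- c(1, 2, -1.197114, 0.1596687)
iv <- c(3, 1, 2, 4)
fmt <- "%9.2f\t%d\t%d\t%8.3f"

# c() of a string and a list is a list: fmt first, then one element per value
args <- c(fmt, as.list(v[iv]))
s <- do.call(sprintf, args)

# same result as spelling the arguments out one by one
identical(s, sprintf(fmt, v[3], v[1], v[2], v[4]))
```

do.call passes each list element as a separate argument, which is why the vector has to go through as.list first.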


On 5/3/06, Paul Roebuck [EMAIL PROTECTED] wrote:
 How would one go about getting sprintf to use the
 values of a vector without having to specify each
 argument individually?

 > v <- c(1, 2, -1.197114, 0.1596687)
 > iv <- c(3, 1, 2, 4)
 > sprintf("%9.2f\t%d\t%d\t%8.3f", v[3], v[1], v[2], v[4])
 [1] "    -1.20\t1\t2\t   0.160"

 Essentially, desired effect would be something like:
 > sprintf("%9.2f\t%d\t%d\t%8.3f", v[iv]) # wish it worked

 --
 SIGSIG -- signature too long (core dumped)



Re: [R] factor to real - best way to convert

2006-05-03 Thread Gabor Grothendieck
You can use the as.is = TRUE argument to read.xls to get character
data rather than factors.
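
If the data has already been read in as a factor, the difference between the conversions in the question can be seen on a small made-up factor. This is a sketch; f, x1 and x2 are hypothetical names, and as.numeric is used where the question uses as.real.

```r
f <- factor(c("-0.32", "0.18", "0.68", "0.18"))

as.numeric(f)                      # integer codes of the levels, not the values
x1 <- as.numeric(levels(f))[f]     # convert the few levels, then index by code
x2 <- as.numeric(as.character(f))  # convert the label of every element
```

A factor stores integer codes that point into its levels, so converting the factor directly returns the codes. Converting the levels (or the labels) first gives back the numbers.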

On 5/3/06, Knut Krueger [EMAIL PROTECTED] wrote:
 I have got a factor from read.xls:
 > is(factor_value)
 [1] "factor"   "oldClass"

 
 [288] -0.32 0.18  0.18  0.18  -0.32 0.18  0.68
 [295] 0.68  0.18
 43 Levels: -0.05 -0.13 -0.15 -0.18 -0.20 -0.26 ... 1.33

 If I use the function as.real(factor_value)

 I get
 
 [271] 17 17  8 22  8  8 17 17 17 17 17 17 17 17 23  7 35  7
 [289] 23 23 23  7 23 35 35 23

 So I used as.real(as.matrix(factor_value))
 The result is as expected:
 [271]    NA    NA -0.35  0.15 -0.35 -0.35    NA    NA    NA
 [280]    NA    NA    NA    NA    NA  0.18 -0.32  0.68 -0.32
 [289]  0.18  0.18  0.18 -0.32  0.18  0.68  0.68  0.18

 OK, I found a way to convert by trial and error, but I do not understand
 why it works, and I found this hint in the fullref manual:

 x <- as.numeric(levels(factor_value))[factor_value]

 OK, much better, but I would not have been able to find that from the
 ?as.numeric help page.

 Both versions still confuse me.

 Maybe somebody is able to write some hints for me.



 with regards
 Knut



Re: [R] Math expressions in pie chart labels?

2006-05-03 Thread Gabor Grothendieck
As a workaround you could use pie3D in the plotrix package
with height=0 and theta=pi, e.g.

library(plotrix)
pie3D(1:3, height = 0, theta = pi,
labels = expression( = 1,  == 2,  = 3))

On 5/3/06, Johannes Graumann [EMAIL PROTECTED] wrote:
 On Wednesday 03 May 2006 09:05, Uwe Ligges wrote:
  Ah, I see, this happens in pie()'s line:
 
     if (!is.na(lab <- labels[i]) && lab != "") {
 
  where lab is one element of the expression.
  I'd like to propose to change that line to
 
     if (!is.na(lab <- labels[i]) && nchar(lab) > 0) {

 What's the canonical way of patching something like this in R? Redefining the
 function at the start of your script?

 Joh






Re: [R] Aggregate?

2006-05-03 Thread Gabor Grothendieck
Suppose we want to sum C over levels of A and that B is constant
within levels of A.  Then:

DF <- data.frame(A = gl(2,2), B = gl(2,2), C = 1:4)  # test data
do.call(rbind, by(DF, DF$A, function(x) replace(x[1,], "C", sum(x$C))))
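
The same idea with column names closer to the question, as a sketch. The data frame trips is a made-up miniature, not the posted data, and first_with_total is a hypothetical helper name.

```r
trips <- data.frame(TRIPID = c("A", "A", "B", "C", "C"),
                    GEAR   = c(100, 100, 100, 200, 200),
                    UNITS  = c(161, 8, 85, 15, 20))

# for each TRIPID keep the first row, with UNITS replaced by the group total
first_with_total <- function(x) replace(x[1, ], "UNITS", sum(x$UNITS))
res <- do.call(rbind, by(trips, trips$TRIPID, first_with_total))
res
```

This relies on the other columns being constant within each TRIPID, since only the first row of each group is kept.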



On 5/3/06, Guenther, Cameron [EMAIL PROTECTED] wrote:
 Hello,

 I have a data set with a grouping variable (TRIPID) and  several other
 variables.  TRIPID is repeated in some areas and I would like to use a
 function like aggregate to sum the variable UNITS according to TRIPID.
 However I would also like to retain the other variables as they are in
 the data set with the new summed TRIPID.

 So what I have is something like this:

 YEARMONTH   DAY CONTINUESPL AREACOUNTY  DEPTH
 DEPUNIT GEARGEAR2   TRAPS   SOAKTIMEUNITS   FACTOR  DISPOSIT
 NUMSETS TRIPST  TRIPID
 19921   26  1 SP0073928   8
 25 4   NA  100 NA  NA
 NA  161 1   NA  NA
 NA  02163399054 19921   26
 1 SP0073928   8 25 4   NA
 100 NA  NA  NA  8
 1   NA  NA  NA  02163399054
 19921   26  2 SP0004228   8
 25 4   NA  100 NA  NA
 NA  161 1   NA  NA
 NA  02163399054  19921   26
 2 SP0004228   8 25 4   NA
 100 NA  NA  NA  8
 1   NA  NA  NA  02163399054
 19921   25  NA  SP0052652   8
 25 4   NA  100 NA  NA
 NA  85  1   NA  NA
 NA  02163399057   19921   26
 NA  SP0037940   8 25 4   NA
 100 NA  NA  NA  70
 1   NA  NA  NA  02163399058
 19921   27  NA  SP0072357   8
 25 4   NA  100 NA  NA
 NA  15  1   NA  NA
 NA  02163399059
 19921   27  NA  SP0072357   8
 25 4   NA  100 NA  NA
 NA  20  1   NA  NA
 NA  02163399059 19921   27
 NA  SP0026324   8 25 4   NA
 100 NA  NA  NA  8
 1   NA  NA  NA  02163399060
 19921   28  1 SP0072357   8
 25 4   NA  100 NA  NA
 NA  2001   NA  NA
 NA  02163399062

 And what I want is this:

 YEARMONTH   DAY CONTINUESPL AREACOUNTY  DEPTH
 DEPUNIT GEARGEAR2   TRAPS   SOAKTIMEUNITS   FACTOR  DISPOSIT
 NUMSETS TRIPST  TRIPID
 19921   26  1 SP0073928   8
 25 4   NA  100 NA  NA
 NA  3381   NA  NA
 NA  02163399054  19921   25
 NA  SP0052652   8 25 4   NA
 100 NA  NA  NA  85
 1   NA  NA  NA  02163399057
 19921   26  NA  SP0037940   8
 25 4   NA  100 NA  NA
 NA  70  1   NA  NA
 NA  02163399058
 19921   27  NA  SP0072357   8
 25 4   NA  100 NA  NA
 NA  35  1   NA  NA
 NA  02163399059
 19921   27  NA  SP0026324   8
 25 4   NA  100 NA  NA
 NA  8   1   NA  NA
 NA  02163399060
 19921   28  1 SP0072357   8
 25 4   NA  100 NA  NA
 NA  2001   NA  NA
 NA  02163399062


 Does anyone know how to do this.  Data file is attached.
 Thanks in advance

 Cameron Guenther, Ph.D.
 Associate Research Scientist
 FWC/FWRI, Marine Fisheries Research
 100 8th Avenue S.E.
 St. Petersburg, FL 33701
 (727)896-8626 Ext. 4305
 [EMAIL PROTECTED]

 

Re: [R] do.call in 2.3.0 vers 2.3.x

2006-05-04 Thread Gabor Grothendieck
See:

https://www.stat.math.ethz.ch/pipermail/r-devel/2006-May/037542.html

On 5/4/06, Dieter Menne [EMAIL PROTECTED] wrote:
 Dear R-Core,

 after switching to 2.3.0, all my trusted do.call constructs that worked in
 2.2 and earlier fail. I noted that changes were introduced to do.call, but I
 could not find out how these relate to my problem.

 The following example works in 2.2 and earlier, but fails because rownames
 are partially NA. I can correct this by manually adding row names, but it's
 a bit of work to check this in all my code.

 Dieter

 --

 wby = by(warpbreaks[, 1:2], warpbreaks$tension,
  function(x) {
data.frame(breaks=mean(x$breaks),var=var(x$breaks))
  }
  )

 cd = do.call(rbind,wby)
 row.names(cd)
 cd

  Output in 2.3.0
  row.names(cd)
 [1] NANA1 NA2
  cd
 Error in data.frame(breaks = c(36.38889, 26.38889, 21.7), var =
 c(270.48693,  :
row names contain missing values
 

 
 platform       i386-pc-mingw32
 arch           i386
 os             mingw32
 system         i386, mingw32
 status
 major          2
 minor          3.0
 year           2006
 month          04
 day            24
 svn rev        37909
 language       R
 version.string Version 2.3.0 (2006-04-24)



Re: [R] combined multiple observations

2006-05-04 Thread Gabor Grothendieck
Assuming DF is your data frame, try this:

   aggregate(DF[,0-1:2], DF[,1:2], sum)
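
With the four rows from the question typed in as DF, this can be checked as follows (a sketch; res is a name introduced here):

```r
DF <- data.frame(St     = c(1, 1, 2, 2),
                 County = c(2, 2, 1, 2),
                 e1 = c(1, 0, 0, 1),
                 e2 = c(0, 1, 0, 0),
                 e3 = c(0, 0, 1, 0))

# DF[, 0-1:2] drops the two key columns; DF[, 1:2] supplies the grouping
res <- aggregate(DF[, 0-1:2], DF[, 1:2], sum)
res
```

There is one output row per distinct (St, County) pair, with the indicator columns summed within the pair.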


On 5/4/06, YIHSU CHEN [EMAIL PROTECTED] wrote:
 Dear R users:

 I have a data frame as follows, where e1-e3 are indicator variables with
 values equal to 0 or 1.

 St  County  e1  e2  e3
 1   2       1   0   0
 1   2       0   1   0
 2   1       0   0   1
 2   2       1   0   0

 What I would like to do is to combine observations with the same pair of St and
 County.  For example, for St=1 and County=2, I would like to
 have the following:

 St  County  e1  e2  e3
 1   2       1   1   0

 Since I have a total of more than 3 observations, any brute-force way
 seems to be inefficient.  Does any of you have experience dealing with
 this?

 Thank you so much.


 Yihsu Chen
 The Johns Hopkins University


