[R] legend with mixed boxes and lines (not both)

2007-05-14 Thread Michael Toews
Hi,
I seem to be unable to get a mixed legend that has lines *or* polygons 
(not both). For example:

ppi <- seq(0,2*pi,length.out=21)[-21]
frame()
plot.window(ylim=c(-5,5),xlim=c(-5,5),asp=1)
polygon(cos(ppi)*4+rnorm(20,sd=.2),sin(ppi)*4+rnorm(20,sd=.2),
        col="green",border=FALSE)
polygon(cos(ppi)*2+rnorm(20,sd=.1),sin(ppi)*2+rnorm(20,sd=.1),
        col="blue",border=FALSE)
abline(0,2,col="red")
legend("topleft",legend=c("out","in","line"),bty="n",
       fill=c("green","blue",NA),col=c(NA,NA,"red"),
       lwd=c(NA,NA,1))

I'm really just guessing at the behaviour of the legend() call by setting 
fill to NA for the line item, and so on. I also tried 
fill=c("green","blue",FALSE), but that didn't go over too well either. 
Adding merge=TRUE just puts the line into the box, and using 
box.lwd=c(1,1,0) did not work either.
Is there a way to do this, or a clean workaround? Thanks in advance.
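
One possible workaround, offered as a sketch rather than as an answer from the thread: draw the boxes as filled square plotting symbols (pch=15) instead of using fill, so that each legend entry gets either a symbol or a line but never both. The pt.cex value is only an assumed size, and the noise terms are dropped to keep the example short.

```r
# Minimal sketch: mixed legend using filled squares (pch=15) for the
# polygons and a line (lty=1) for the abline; NA suppresses the other.
ppi <- seq(0, 2*pi, length.out=21)[-21]
frame()
plot.window(ylim=c(-5,5), xlim=c(-5,5), asp=1)
polygon(cos(ppi)*4, sin(ppi)*4, col="green", border=NA)
polygon(cos(ppi)*2, sin(ppi)*2, col="blue", border=NA)
abline(0, 2, col="red")
lg <- legend("topleft", legend=c("out","in","line"), bty="n",
             col=c("green","blue","red"),
             pch=c(15,15,NA),   # squares for the two polygons only
             lty=c(NA,NA,1),    # a line for the third entry only
             pt.cex=2)          # assumed size; adjust to taste
```

The squares are not true legend boxes, so they have no border, but they keep the line entry free of a box.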
+mt

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] sequential for loop

2007-04-19 Thread Michael Toews
Hi all,

I'm usually comfortable using the *apply functions for vectorizing loops 
in R. However, my particular problem now is using it in a sequential 
operation, which uses values evaluated in an offset of the loop vector. 
Here is my example using a for loop approach:

dat <- data.frame(year=rep(1970:1980,each=365),yday=1:365)
dat$value <- sin(dat$yday*2*pi/365)+rnorm(nrow(dat),sd=0.5)
dat$ca <- dat$cb <- 0 # consecutive above and below 0

for(n in 2:nrow(dat)){
  if(dat$value[n] > 0)
    dat$ca[n] <- dat$ca[n-1] + 1
  else
    dat$cb[n] <- dat$cb[n-1] + 1
}

I'm inquiring if there is a straightforward way to vectorize this (or a 
similar example) in R, since it gets rather slow with larger data 
frames. If there is no straightforward method, no worries.
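
A possible loop-free version, sketched here as an assumption and not taken from the thread, uses rle() to find the runs and sequence() to number the positions inside each run. The helper name run_counts is hypothetical. It reproduces the loop above, including the detail that the first element is never tested and keeps its initial 0.

```r
# Sketch: run-length counters without an explicit loop.
# run_counts() is a hypothetical helper name, not from the thread.
run_counts <- function(value) {
  above <- value[-1] > 0                # the loop never tests element 1
  pos <- sequence(rle(above)$lengths)   # position within each run
  list(ca = c(0, ifelse(above, pos, 0)),   # consecutive values above 0
       cb = c(0, ifelse(above, 0, pos)))   # consecutive values not above 0
}

dat <- data.frame(year=rep(1970:1980,each=365),yday=1:365)
dat$value <- sin(dat$yday*2*pi/365)+rnorm(nrow(dat),sd=0.5)
rc <- run_counts(dat$value)
dat$ca <- rc$ca
dat$cb <- rc$cb
```

Missing values are not handled in this sketch; rnorm() never produces them, but real data might.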

Thanks in advance.
+mt



[R] timeDate business day

2007-03-12 Thread Michael Toews
Those numbers look like ... well, numbers. You want characters! Try 
converting the integer to a character before trying to do a string 
parse, e.g.:

ymd.int <- c(20050104, 20050105, 20050106, 20050107, 20050110, 20050111, 
             20050113, 20050114)
ymd <- as.Date(as.character(ymd.int),"%Y%m%d")

As for the other functions you are looking at (timeDate, 
timeRelative) -- I've never seen these, so I'm guessing they are from 
S-PLUS. In R, you can use diff or difftime (which work with the Date 
and POSIXlt, or Date-Time, classes), e.g.:

diff(ymd)
diff(ymd,2)
diff(ymd,3)

or do some arithmetic:

difftime(ymd[1],ymd[4])
difftime(ymd[1],ymd[4],units="weeks")

Hopefully this is helpful to you!
+mt



Re: [R] timeDate business day

2007-03-12 Thread Michael Toews

> [1] 20050104 20050105 20050106 20050107 20050110 20050111 20050113
> 20050114
> > ymd <- as.Date(as.character(ymd.int),"%Y%m%d")
> > ymd
> [1] "2005-01-04" "2005-01-05" "2005-01-06" "2005-01-07" "2005-01-10"
> [6] "2005-01-11" "2005-01-13" "2005-01-14"
> > class(ymd)
> [1] "Date"
>
> While the variable ymd is actually of class Date, the format is not
> yyyymmdd but yyyy-mm-dd as one can see in the previous example.
> As Young, I do not see what I am missing here.
> Any hint would be appreciated.
>
> AA.
What happened in the beginning is that I had to parse the character 
string into a Date-Time class (Date, in this case, as you correctly 
pointed out). POSIX is a standard, used mainly by Unix computers, that 
defines date formatters such as %Y for a 4-digit year, among others. They 
are all listed in great detail in ?strptime (the name means "string parse 
time"). In this case the input parsing pattern was "%Y%m%d", with no 
spaces between the numbers.

When that class prints out, the default format is ISO 8601 (see 
http://en.wikipedia.org/wiki/ISO_8601). When R prints an object of class 
Date to your screen, it formats it ISO 8601-style for you. If you want to 
see it differently, you can try:

format(ymd,"%Y/%d/%m")

The date is actually stored internally as an ordinal (a count of days 
since 1970-01-01), somewhat like how MS Excel dates work. You can see how 
it works internally:

str(ymd)
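
Newer versions of R may print str(ymd) in a formatted way that hides the stored number. As a small sketch of another way to look inside (the two dates are just the first two from the example above), unclass() strips the class and shows the day count directly:

```r
# Sketch: look at the number stored inside a Date object.
ymd <- as.Date(as.character(c(20050104, 20050105)), "%Y%m%d")
days <- unclass(ymd)                     # plain numbers, class removed
days
as.Date(days, origin = "1970-01-01")     # round trip back to Date
```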

Hopefully I've demystified some of this .. any other questions?
+mt



Re: [R] timeDate business day

2007-03-12 Thread Michael Toews
Sadly, I don't know of any tutorials or much help on the web for R. 
That doesn't mean they don't exist; you might just have to look 
around (www.rseek.org is a good place to start).
I've learned almost everything I know through:
?strptime

Also check out the methods for the classes, for example:

methods(class="Date")
methods(class="POSIXct")

And certainly check their help pages ... there is loads of stuff here 
that I haven't discovered myself. (Note, if you are new to S3 classes: 
when a function is named as the method, then ".", then the class, you 
only need to type the first part. For example, use summary(ymd), not 
summary.Date(ymd), if class(ymd) == "Date".)

I think the fundamental thing to know is that there are three main 
DateTimeClasses:

  1. POSIXct - has date, time and optionally time-zone info -- very
     handy for using in data.frame objects (and frankly I think it
     should be renamed to "DateTime", since the name "POSIXct" has
     nothing really to do directly with date/times)
  2. POSIXlt - as far as I'm concerned, this has the same
     functionality as POSIXct, but it cannot be used in data.frame
     objects (and frankly, I think it should be deprecated in favour of
     #1 to reduce future confusion)
  3. Date - use this if you don't care about times or time-zones
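
To make the three classes a little more concrete, here is a minimal sketch of one instant held in each of them. The timestamp and the UTC time zone are assumptions chosen only for illustration.

```r
# Sketch: the same instant in the three classes (UTC assumed for clarity).
ct <- as.POSIXct("2005-01-04 12:30:00", tz = "UTC")
lt <- as.POSIXlt(ct)
d  <- as.Date(ct)

unclass(ct)        # one number: seconds since 1970-01-01 00:00:00 UTC
lt$hour; lt$min    # POSIXlt is a list of broken-down components
unclass(d)         # one number: days since 1970-01-01

# A POSIXct column sits in a data.frame like any other numeric vector
df <- data.frame(when = ct + 3600 * 0:2, value = 1:3)
class(df$when)
```

The single-number storage is what makes POSIXct and Date convenient as columns, while POSIXlt is handier for pulling out individual fields such as the hour.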

But it would be nice to track down a good tutorial somewhere.
+mt

Young Cho wrote:
> Thanks so Michael! If you know of a tutorial or introductory document 
> about timeDate manipulation or time series manipulation in R, can you 
> share it? It is hard to find by googling... I'd very appreciate any 
> advice.



[R] pdf device bounding box?

2007-03-09 Thread Michael Toews
I apologize if I don't fully understand your question, but the pdf 
device has a MediaBox, which is equivalent to the BoundingBox in EPS 
file. The PDFs from R are defined nicely using height/width dimensions, 
and work well with embedding in pdflatex, etc. For example:

pdf("test.pdf",height=3,width=3)
plot(1:10)
dev.off() # close the device so the file is complete

and view the (partially binary) output in your shell:
less -N test.pdf

On line 117 of this file, I see "/MediaBox [0 0 216 216]", which is a 3in 
by 3in box measured in PostScript points (72 points per inch).
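
The same check can be sketched from within R. This assumes a version of R whose pdf() device accepts the compress argument (it is not in the original post), and it writes to a temporary file rather than test.pdf:

```r
# Sketch: check the MediaBox that pdf() writes for a 3in x 3in page.
f <- tempfile(fileext = ".pdf")
pdf(f, height = 3, width = 3, compress = FALSE)  # compress= assumed available
plot(1:10)
invisible(dev.off())

pts <- 3 * 72                                    # 72 PostScript points per inch
txt <- readLines(f, warn = FALSE)
box <- grep("MediaBox", txt, value = TRUE, useBytes = TRUE)
box
```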

I don't understand how you are mixing this in with the epstopdf command. 
If you want to make both a PDF and an EPS, my best advice is to do both 
directly from R (see ?postscript for EPS file generation; the same 
example as above will have "%%BoundingBox: 0 0 216 216" on line 10), and 
your output for both formats should be clean, simple, and good enough 
for publishers and everyone else to use.

Just one caution: if you have a Windows computer and R < 2.5.1 (which is 
most of us), make sure you write EPS files before loading up a PDF 
device (PR#9517).



[R] curve of density on histogram

2007-03-08 Thread Michael Toews
You are so close ... you just need to specify that you want your 
histograms to show density, not percent. I've only edited one line of 
your script, the rest is good and plots nicely:

histogram(~ resp | group, col="steelblue",type="density",



[R] Using logarithmic y-axis (density) in a histogram

2007-03-08 Thread Michael Toews
You might also want to try density, since it doesn't use bins and so, in 
theory, never has the empty bins that a log axis cannot show. For 
example, take a Weibull distribution, which could look better with a log 
y-axis:

x <- rweibull(1000,1,5)
par(mfrow=c(2,1))
plot(density(x,from=0))
rug(x)
plot(density(x,from=0),log="y")
rug(x)

You may need to fiddle with the bw (bandwidth) parameter of density, 
since this controls the smoothness of the kernel estimate (see ?density).
+mt



[R] plot(): I want to display dates on X-axis.

2007-03-07 Thread Michael Toews

> e.g. dat
>
>      [,1]     [,2]
> [1,]  300 20060101
> [2,]  257 20060102
> [3,]  320 20060103
> [4,]  311 20060104
> [5,]  297 20060105
> [6,]  454 20060106
> [7,]  360 20060107
> [8,]  307 20060108

This looks like a matrix ... not a data frame. You definitely want a data 
frame. So let's say you have:
dat <- matrix(c(round(rnorm(8)*100+400),20060101:20060108),ncol=2)

#convert it:
dat <- as.data.frame(dat)

#determine the dates:
dat$date <- as.Date(as.character(dat$V2),"%Y%m%d")

#plot it:
plot(V1 ~ date, dat)



[R] Named backreferences in replacement patterns

2007-03-07 Thread Michael Toews
How about turning them into a native date-time class, then re-formatting them?
For example, say you have some American dates in a character vector:
American.datechar <- c("5/15/1976","2/15/1970","1/9/2006")
# parse this:
American.date <- strptime(American.datechar,"%m/%d/%Y")
# reformat:
format(American.date,"%d/%m/%Y")



[R] alpha parameter in function rgb to specify color

2007-03-07 Thread Michael Toews
Part of the problem is that alpha is a new and undocumented feature 
(note to developers: add this info into the pdf and rgb documentation). 
It only works in PDFs (maybe on quartz on Macs?), so you need to write 
to a PDF file:

pdf("out.pdf",version="1.4")
plot(1:10,col=rgb(1,0,0,alpha=seq(0,1,length.out=10)))
dev.off()

Now open up the PDF file...



[R] exact matching of names in attr

2007-02-26 Thread Michael Toews
In R 2.5.0 (r40806), one of the changes is to allow partial matching of 
the name in the attr function. However, how can I tell if I have an exact 
match or not?

For example, checking to see if an object has a "name" attribute, then 
giving it one if it doesn't:

dat <- data.frame(x=1:10,y=rnorm(10))
if(is.null(attr(dat,"name")))
    attr(dat,"name") <- "Site 1"
str(dat)

(This example works in R < 2.5.) Although there is no "name" attribute on 
the data.frame, it partially matches "names", with the result that the 
attribute is never set. (Personally, I think this change in the attr 
function is not desirable, and I much prefer exact matches to avoid 
unintentional errors.)

How can I tell if this is an exact match? Is there a way to force an 
exact match?
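
For readers on current versions of R, a sketch of two ways to force an exact match. The exact argument of attr() is a later addition and is not available in the R 2.5.0 discussed here; checking names(attributes()) avoids attr() matching altogether.

```r
dat <- data.frame(x=1:10, y=rnorm(10))

attr(dat, "name")                  # may partially match the "names" attribute
attr(dat, "name", exact = TRUE)    # NULL: nothing is called exactly "name"

# Alternative check that does not rely on attr() matching at all
has_name <- "name" %in% names(attributes(dat))

if (is.null(attr(dat, "name", exact = TRUE)))
  attr(dat, "name") <- "Site 1"
```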

Thanks.

+mt



[R] looping

2007-02-26 Thread Michael Toews
Another way is to use an indexed list, which is far tidier than 
your method. If you mean "about 100" as in an irregular number, then a 
list is your friend (i.e., a ragged array that can sometimes have 97 
samples, sometimes 105 samples, etc.). Similar to your example:

dat <- runif(100000,0,100) # fake dataset
smp <- list() # need an empty list first
for(i in 1:1000)
    smp[[i]] <- sample(dat,100)

However, if you are new to R/S, the best advice is to learn to _not_ use 
the for loop (because it is slow, and there are vectorized ways). For 
example, if we want to find the mean of each sample, then return a tidy 
result:

sapply(smp,mean)

or a crazy new analysis you might be working on:

crazy <- function(x,y) (sum(x > y)^2)/sum(x)
sapply(smp,crazy,10)

etc.
+mt



[R] setting options in a list

2006-10-22 Thread Michael Toews
Hi,
I have a question regarding setting options that use a list. There are 
none that I'm aware of in R-base, however, it can make sense for some 
custom situations. For example:
> options(myoptions=list(lwd=1,density=NULL,angle=45,col="green"))
Getting values is simple:
> getOption("myoptions")$col
However, setting a value from the list is less intuitive; here are a few 
failed attempts that could make sense on an intuitive level:
> setOption("myoptions")$col <- "red"  # non-existing function
> options("myoptions")[[1]]$col <- "red"
Here is an attempt that almost works, but as documented, it only sets a 
local .Options object for S-compatibility:
> .Options["myoptions"][[1]]$col <- "red"
> .Options["myoptions"][[1]]$col   # works, but ...
> getOption("myoptions")$col

The only method I know of is to modify a copy of the list from the 
options, then re-set the option:
> cp <- getOption("myoptions")
> cp$col <- "red"
> options(myoptions=cp)
> getOption("myoptions")$col

So, my question is if there is a more elegant method of setting an 
option in a list, that doesn't need copying and multiple commands? 
Having a 'setOption' function sure could be helpful for this instance, 
however I'm unsure how to implement this method (or if it is possible in 
the current version of R).
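
One way such a function could be sketched is as a thin wrapper around the copy-and-reset steps above. The name set_list_option is hypothetical (it is not part of R), and the copy still happens inside the wrapper; it only hides the multiple commands.

```r
# Sketch of a hypothetical helper; not a base R function.
set_list_option <- function(name, ...) {
  cur <- getOption(name)
  if (is.null(cur)) cur <- list()               # start fresh if unset
  new <- utils::modifyList(cur, list(...))      # update the named elements
  do.call(options, stats::setNames(list(new), name))
  invisible(new)
}

options(myoptions=list(lwd=1,density=NULL,angle=45,col="green"))
set_list_option("myoptions", col="red")
getOption("myoptions")$col
```

Note that modifyList() treats a NULL value as a request to remove that element, so this sketch cannot set an element to NULL.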
Thanks.
+mt



Re: [R] Log-scale in histogramm

2006-10-14 Thread Michael Toews
A log y-axis on a histogram is conceptually wrong, but it isn't a bad idea 
either. It is conceptually safer to show this using density. Consider 
an exponential distribution, which could look better with a log y-axis:

x <- rexp(10000,.1)
xd <- density(x,from=0)
par(mfrow=c(2,1))
plot(xd)
plot(xd,log="y",ylab="Log Density")

Just use caution in the interpretation. The large dropping/oscillating 
features on the right-hand side of the log plot come from sparse 
data points, and they can be smoothed out by adjusting 'bw' to get a 
theoretical interpretation of the true distribution.
+mt



[R] Object attributes in R

2006-10-11 Thread Michael Toews
Hi,
I have questions about object attributes, and how they are handled when 
subsetted. My examples will use:
tm <- (1:10)/10
ds <- (1:10)^2
attr(tm,"units") <- "sec"
attr(ds,"units") <- "cm"
dat <- data.frame(tm=tm,ds=ds)
attr(dat,"id") <- "test1"

When a primitive class object (numeric, character, etc.) is subsetted, 
the attributes are often stripped away, but the rules change for more 
complex classes, such as a data.frame, where they 'stick' for the 
data.frame, but attributes from the members are lost:
tm[3:5]# lost
ds[-3] # lost
str(dat[1:3,]) # only kept for data.frame

Is there any way of keeping the attributes when subsetting primitive 
classes, like a fictional "attr.drop" option within the [ ] brackets? The 
best alternative I have found is to make a new object, and copy the 
attributes:
tm2 <- tm[3:5]
attributes(tm2) <- attributes(tm)

However, for the data.frame, how can I copy the attributes over (without 
using a for loop -- I've tried a few things using sapply but no success)?
Also I don't see how this is consistent with an empty index, [], where 
attributes are always retained (as documented):
tm[]
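
As a sketch of one possible approach (not an answer from the thread), a small helper can re-attach the attribute after subsetting, and Map() can apply it column by column to a data.frame without an explicit for loop. The helper name keep_units is hypothetical, and only the "units" attribute is copied.

```r
# Sketch: re-attach a "units" attribute after subsetting.
tm <- (1:10)/10
ds <- (1:10)^2
attr(tm, "units") <- "sec"
attr(ds, "units") <- "cm"
dat <- data.frame(tm=tm, ds=ds)

# keep_units() is a hypothetical helper, not a base R function
keep_units <- function(new, old) {
  attr(new, "units") <- attr(old, "units")
  new
}

tm2 <- keep_units(tm[3:5], tm)          # vector case

part <- dat[1:3, ]                      # column attributes are dropped here
part[] <- Map(keep_units, part, dat)    # restore them column by column
str(part)
```

Copying a single named attribute is safer than copying everything with attributes(), which would also carry over length-dependent attributes such as names.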

I have other concerns about the evaluation of objects with attributes 
(e.g. ds/tm), where the attributes from the first object are retained 
for the output, but this evaluation of logic is a whole other can of 
worms I'd rather keep closed for now.

+mt



[R] Documentation patch for 'match' and 'palette'

2006-09-28 Thread Michael Toews
Here is a patch to improve documentation for finding useful, yet newish, 
functions: 'findInterval' and 'colorRamp'. I think that it is worthwhile 
to mention these in the 'seealso' section of the similar 'match' and 
'palette' documents. I had difficulty finding these functions at first, 
as they have compound names. Modify the patch as needed.

+mt

Index: src/library/base/man/match.Rd
===================================================================
--- src/library/base/man/match.Rd   (revision 39542)
+++ src/library/base/man/match.Rd   (working copy)
@@ -65,6 +65,8 @@
   \code{\link{pmatch}} and \code{\link{charmatch}} for (\emph{partial})
   string matching, \code{\link{match.arg}}, etc for function argument
   matching.
+  \code{\link{findInterval}} similarly returns a vector of positions, but
+  finds numbers within intervals, rather than exact matches.
 
   \code{\link{is.element}} for an S-compatible equivalent of \code{\%in\%}.
 }
Index: src/library/grDevices/man/palette.Rd
===================================================================
--- src/library/grDevices/man/palette.Rd   (revision 39542)
+++ src/library/grDevices/man/palette.Rd   (working copy)
@@ -31,7 +31,7 @@
   \code{\link{colors}} for the vector of built-in \dQuote{named} colors;
   \code{\link{hsv}}, \code{\link{gray}}, \code{\link{rainbow}},
   \code{\link{terrain.colors}},\dots to construct colors;
-  
+  \code{\link{colorRamp}} to interpolate colors, making custom palettes;
   \code{\link{col2rgb}} for translating colors to RGB 3-vectors.
 }
 \examples{