Re: [R] ReShape chicks example - line plots

2009-07-06 Thread hadley wickham
On Mon, Jul 6, 2009 at 8:22 PM, Mark Knecht wrote: > Hi, >   In the examples from the ReShape package there is a simple example > of using melt followed by cast that produces a smallish amount of > output about the chicks database. Here's the code: > > library(reshape) > > names(ChickWeight) <- tol

Re: [R] NA values trimming

2009-07-06 Thread hadley wickham
On Mon, Jul 6, 2009 at 12:12 AM, nyk wrote: > > Thanks for your reply! This is what I was looking for! > I'm using > nas1 <- apply(data_matrix,1,function(x)sum(is.na(x))/nrow(data_matrix)) > nas2 <- apply(data_matrix,2,function(x)sum(is.na(x))/ncol(data_matrix)) You can simplify this a little: pe

Re: [R] ggplot2: Boxplot with given box size

2009-07-05 Thread hadley wickham
Hi Malcolm, You need to tell geom_boxplot not to use stat_boxplot: geom_boxplot(aes(lower=y_q1, upper=y_q3, middle=y_med, ymin=y_min, ymax=y_max), stat = "identity") Hadley On Mon, Jul 6, 2009 at 6:55 AM, Malcolm Ryan wrote: > Is there anyway in ggplot2 to set the aesthetics for a geom_boxplot >

Re: [R] OK - I got the data - now what? :-)

2009-07-05 Thread hadley wickham
>   I think the root cause of a number of my coding problems in R right > now is my lack of skills in reading and grabbing portions of the data > out of arrays. I'm new at this. (And not a programmer) I need to find > some good examples to read and test on that subject. If I could locate > which co

Re: [R] Skeleton Package to Flesh Out?

2009-07-05 Thread hadley wickham
Also make sure to check roxygen (from roxygen.org) - it makes package documentation much much easier. Ironically, the documentation for roxygen currently leaves something to be desired but I think Peter and Manuel are working on it. Hadley On Sat, Jul 4, 2009 at 3:59 PM, Jason Rupert wrote: > >

Re: [R] What command lists everything in a package?

2009-07-05 Thread hadley wickham
> 2) Related to the above, how do I tell what packages are currently > loaded at any given time so that I don't waste time loading things > that are already loaded? search() tells me what's available, but > what's loaded? The best I can find so far goes like this: Loading something a second time t

Re: [R] help with dealing with integer(0) returns from grep used within a conditional loop

2009-07-04 Thread hadley wickham
On Sat, Jul 4, 2009 at 7:56 PM, Mark Kimpel wrote: > I am using grep to locate colnames to automate a report build and have > run into a problem when a colname is not found. The use of integer(0) > in a conditional statement seems to be a no no as it has length 0. > Below is a self-contained trivia

Re: [R] Passing expression as argument to do.call

2009-07-02 Thread hadley wickham
On Thu, Jul 2, 2009 at 3:34 PM, Sebastien Bihorel wrote: > Dear R-users, > > I would like to know how expressions could be passed as arguments to do.call > functions. As illustrated in the short example below, concatenating lists > objects and an expression creates an expression object, which is no

Re: [R] Learning S3

2009-07-02 Thread Hadley Wickham
On Thu, Jun 18, 2009 at 12:08 PM, Dirk Eddelbuettel wrote: > > On 18 June 2009 at 09:36, Bert Gunter wrote: > | -- or Chapter 4 in S PROGRAMMING? (you'll need to determine if it's "reader > | friendly") > > +1 > > It helped me a lot too back in the day.  But I am wondering if there are good > curre

Re: [R] Select values at random by id value

2009-07-02 Thread hadley wickham
On Thu, Jul 2, 2009 at 8:15 AM, James Martin wrote: > Hadley, Sunil, and list, > > This is not quite doing what I wanted it to do (as far as I can tell). I > perhaps did not explain it thoroughly.  It seems to be sampling one value > for each day leaving ~200 observations. I need for it randomly ch

Re: [R] Select values at random by id value

2009-07-01 Thread hadley wickham
On Wed, Jul 1, 2009 at 2:10 PM, Sunil Suchindran wrote: > #Highlight the text below (without the header) > # read the data in from clipboard > > df <- do.call(data.frame, scan("clipboard", what=list(id=0, > date="",loctype=0 ,haptype=0))) > > # split the data by date, sample 1 observation from each

Re: [R] How do I change which R Graphics Device is active?

2009-06-30 Thread hadley wickham
On Tue, Jun 30, 2009 at 2:12 PM, Barry Rowlingson wrote: > On Tue, Jun 30, 2009 at 8:05 PM, Mark Knecht wrote: > >> You could wrap it in a function of your own making, right? >> >> AddNewDev = function() {dev.new();AddNewDev=dev.cur()} >> >> histPlot=AddNewDev() >> >> Seems to work. > >  You leaRn

Re: [R] ggplot2 x axis question

2009-06-29 Thread hadley wickham
ackage. That code is giving me > the following error: > >> qplot(reorder(model,delta),delta,data=growthm.bic) > Error in UseMethod("reorder") : no applicable method for "reorder" > > Cheers, > Chris > > On 6/28/09 8:21 PM, hadley wickham wrote: > >

Re: [R] ggplot2 x axis question

2009-06-28 Thread hadley wickham
Hi Chris, Try this: qplot(reorder(model, delta), delta, data = growthm.bic) Hadley On Sun, Jun 28, 2009 at 9:53 AM, Christopher Desjardins wrote: > Hi, > I have 45 models that I have named: 1, 2, 3, ... , 45 and I am trying to > plot them in order of ascending BIC values. I am however unclear a

Re: [R] simple loop

2009-06-28 Thread hadley wickham
> Also consider ddply in the plyr package (although that's overkill if > you're only having two loops) Maybe, but it sure is much simpler: library(plyr) ddply(data, c("industry","year"), summarise, avg = mean(X1)) Hadley -- http://had.co.nz/
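For readers without plyr installed, the same grouped mean can be sketched in base R with aggregate(). The data frame below is made up for illustration, since the thread's data is not shown here; only the column names industry, year and X1 come from the ddply call above.

```r
# Hypothetical data; only the column names come from the thread.
data <- data.frame(
  industry = c("A", "A", "A", "B", "B"),
  year     = c(2007, 2007, 2008, 2007, 2007),
  X1       = c(1, 3, 5, 10, 20)
)

# Mean of X1 within each industry/year combination, a base R counterpart to
# ddply(data, c("industry", "year"), summarise, avg = mean(X1)).
avg <- aggregate(X1 ~ industry + year, data = data, FUN = mean)

# Rename the summary column to match the name used in the ddply call.
names(avg)[names(avg) == "X1"] <- "avg"
avg
```

The formula interface drops rows with missing values by default, so results may differ from ddply when X1 contains NA.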

Re: [R] a plot of stacked boxes

2009-06-26 Thread hadley wickham
On Fri, Jun 26, 2009 at 10:27 PM, Osman Al-Radi wrote: > Dear Richard and David, > > Thanks for this reference. I looked into vcd and mosaic plot, it is a nice > plot for investigating associations between two or more variables. However, > I just need to plot the frequency of a single variable as t

Re: [R] Using by() and stacking back sub-data frames to one data frame

2009-06-25 Thread hadley wickham
Have a look at ddply from the plyr package, http://had.co.nz/plyr. It's made for exactly this type of operation. Hadley On Wed, Jun 24, 2009 at 10:34 PM, Stephan Lindner wrote: > Dear all, > > > I have a code where I subset a data frame to match entries within > levels of an factor (actually, the

Re: [R] "by" question

2009-06-24 Thread hadley wickham
You might also want to look at the plyr package, http://had.co.nz/plyr. In particular, ddply + transform makes these tasks very easy. library(plyr) ddply(mtcars, "cyl", transform, pos = seq_along(cyl), mpg_avg = mean(mpg)) Hadley On Wed, Jun 24, 2009 at 11:48 AM, David Hugh-Jones wrote: > That
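As a rough base R counterpart to the ddply + transform call above, ave() can add per-group columns without any extra package. This is a sketch, not the approach recommended in the thread; the name cars is made up, and unlike ddply the rows stay in their original order.

```r
# mtcars ships with R; cyl is the grouping column used in the thread.
cars <- mtcars

# Running position of each row within its cyl group.
cars$pos <- ave(seq_along(cars$cyl), cars$cyl, FUN = seq_along)

# Group mean of mpg, repeated on every row of the group.
cars$mpg_avg <- ave(cars$mpg, cars$cyl, FUN = mean)

head(cars[, c("cyl", "mpg", "pos", "mpg_avg")])
```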

Re: [R] Apply as.factor (or as.numeric etc) to multiple columns

2009-06-23 Thread hadley wickham
Hi Mark, Have a look at colwise (and numcolwise and catcolwise) in the plyr package. Hadley On Tue, Jun 23, 2009 at 4:23 PM, Mark Na wrote: > Hi R-helpers, > > I have a dataframe with 60columns and I would like to convert several > columns to factor, others to numeric, and yet others to dates. R

[R] [R-pkgs] plyr 0.1.9

2009-06-23 Thread Hadley Wickham
plyr is a set of tools for a common set of problems: you need to break down a big data structure into manageable pieces, operate on each piece and then put all the pieces back together. For example, you might want to: * fit the same model to subsets of a data frame * quickly calculate summary

Re: [R] Roxygen vs Sweave for S4 documentation

2009-06-21 Thread hadley wickham
> I have been using R for a while.  Recently, I have begun converting my > package into S4 classes.  I was previously using Rdoc for documentation. > Now, I am looking to use the best tool for S4 documentation.  It seems that > the best choices for me are Roxygen and Sweave (I am fine with tex). >

Re: [R] Dataset suggestion sought

2009-06-18 Thread hadley wickham
> In revising my book Regression Modeling Strategies for a second edition, I > am seeking a dataset for exemplifying multiple regression using least > squares.  Ideally the dataset would have 5-40 variables and 40-1 > independent observations, and would generate significant interest for a wide

Re: [R] Learning S3

2009-06-18 Thread Hadley Wickham
ck > Sent: Thursday, June 18, 2009 9:17 AM > To: Hadley Wickham > Cc: r-help > Subject: Re: [R] Learning S3 > > There is a section on Object Orientation in MASS (I have 2nd ed). > > On Thu, Jun 18, 2009 at 12:06 PM, Hadley Wickham wrote: >> Hi all, >> >> Do you k

[R] Learning S3

2009-06-18 Thread Hadley Wickham
Hi all, Do you know of any good resources for learning how S3 works? I've somehow become familiar with it by reading many small pieces, but now that I'm teaching it to students I'm wondering if there are any good resources that describe it completely, especially in a reader-friendly way. So far

[R] [OT] VBA to save excel as csv

2009-06-15 Thread Hadley Wickham
Hi all, This is a little off-topic, but it is on the general topic of getting data into R. I'm looking for an Excel macro / VBA script that will export all spreadsheets in a directory (with one file per tab) into csv. Does anyone have anything like this? Thanks, Hadley -- http://had.co.nz/

[R] Programmatically copying a graphic to the clipboard

2009-06-12 Thread Hadley Wickham
Hi all, Is there a cross-platform way to do this? On the Mac, I can do this by saving an eps file, and then using pbcopy. Is it possible on other platforms? Hadley -- http://had.co.nz/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman

Re: [R] how to substitute missing values (NAs) by the group means

2009-06-08 Thread hadley wickham
On Mon, Jun 8, 2009 at 8:56 PM, Mao Jianfeng wrote: > Dear Ruser's > > I ask for helps on how to substitute missing values (NAs) by mean of the > group it is belonging to. > > my dummy dataframe is: > >> df >       group traits > 1  BSPy01-10     NA > 2  BSPy01-10    7.3 > 3  BSPy01-10    7.3 > 4  

Re: [R] Looking for easy way to normalize data by groups

2009-06-08 Thread hadley wickham
On Mon, Jun 8, 2009 at 10:29 AM, Herbert Jägle wrote: > Hi, > > i do have a dataframe representing data from a repeated experiment. PID is a > subject identifier, Time are timepoints in an experiment which was repeated > twice. For each subject and all three timepoints there are 2 sets of four > va

Re: [R] A very frustrating read.table error message

2009-06-06 Thread hadley wickham
On Sat, Jun 6, 2009 at 5:02 PM, Adam D. I. Kramer wrote: > Dear Colleagues, > >        Occasionally I deal with computer-generated (i.e., websurvey) data > files that haven't quite worked correctly. When I try to read the data into > R, I get something like this: > > Error in scan(file, what, nmax,

Re: [R] OT: Inference for R - Interview

2009-06-04 Thread hadley wickham
Is it really necessary to further advertise this company which already spams R-help subscribers? Hadley On Thu, Jun 4, 2009 at 10:41 PM, Ajay ohri wrote: > Dear All, > > Slightly off -non technical topic ( but hey it is Friday) > > Following last week's interview with REvolution Computing which m

Re: [R] Minor tick marks for date/time ggplot2 (this is better, but not exactly what I want)

2009-06-04 Thread hadley wickham
On Mon, Jun 1, 2009 at 2:18 PM, stephen sefick wrote: > library(ggplot2) > > melt.updn <- (structure(list(date = structure(c(11808, 11869, 11961, 11992, > 12084, 12173, 12265, 12418, 12600, 12631, 12753, 12996, 13057, > 13149, 11808, 11869, 11961, 11992, 12084, 12173, 12265, 12418, > 12600, 12631,

Re: [R] Still can't find missing data - How do I get NA in xtabs with factors?

2009-06-02 Thread hadley wickham
>> Let's see if I understand this.  Do I iterate through >>    x <- factor(x, levels = c(levels(x), NA), exclude = NULL) >> for each of the few hundred variables (x) in my data frame? > > > Yes, for all being factors. Wouldn't addNA() be the preferred method? To do it for all variables is pretty simp

Re: [R] [ANN] ggplot2 + rggobi course. July 30-31, Washington DC

2009-06-02 Thread hadley wickham
. All proceeds go to the GGobi Foundation to support graphics research. Find out more, and book your tickets online at http://lookingatdata.com Regards, Hadley Wickham Dianne Cook

Re: [R] ggplot2 and Date class

2009-06-01 Thread hadley wickham
You might have an out-of-date version of the plyr package - try install.packages("plyr") Hadley On Mon, Jun 1, 2009 at 10:20 AM, Matt Frost wrote: > I'm trying to plot a time series in ggplot, but a date column in my > data frame is causing errors. Rather than provide my own data, I'll > just re

Re: [R] ggplot2: annotating plot with mathematical formulae

2009-05-16 Thread hadley wickham
Hi Paul, Unfortunately that's not something that's currently possible with ggplot2, but I am thinking about how to make it possible. Hadley On Sat, May 16, 2009 at 7:48 AM, Paul Emberson wrote: > Hi Stephen, > > The problem is that the label on the graph doesn't get rendered with a > superscrip

Re: [R] assign unique size of point in xyplot

2009-05-15 Thread hadley wickham
On Thu, May 14, 2009 at 2:14 PM, Garritt Page wrote: > Hello,I am using xyplot to try and create a conditional plot.  Below is a > toy example of the type of data I am working with > > slevel <- rep(rep(c(0.5,0.9), each=2, times=2), times=2) > > tlevel <- rep(rep(c(0.5,0.9), each=4), times=2) > >

Re: [R] memory usage grows too fast

2009-05-14 Thread hadley wickham
On Thu, May 14, 2009 at 6:21 PM, Ping-Hsun Hsieh wrote: > Hi All, > > I have a 1000x100 matrix. > The calculation I would like to do is actually very simple: for each row, > calculate the frequency of a given pattern. For example, a toy dataset is as > follows. > > Col1    Col2    Col3    Co

Re: [R] Function to read a string as the variables as opposed to taking the string name as the variable

2009-05-14 Thread hadley wickham
On Thu, May 14, 2009 at 12:16 PM, Lori Simpson wrote: > I am writing a custom function that uses an R-function from the > reshape package: cast.  However, my question could be applicable to > any R function. > > Normally one writes the arguments directly into a function, e.g.: > > result=cast(tabl

Re: [R] Help with reshape/reShape and indexing

2009-05-13 Thread hadley wickham
> This does it more or less your way: > > ds <- split(df, df$Name) > ds <- lapply(ds, function(x){x$Index <- seq_along(x[,1]); x}) > df2 <- unsplit(ds, df$Name) > tapply(df2$X1, df2[,c("Name", "Index")], function(x) x) > > athough there may exist much easier ways ... Here's one way with the plyr a

Re: [R] ggplot2: recommended workaround for broken legend.position="top"

2009-05-10 Thread hadley wickham
On Sun, May 10, 2009 at 10:32 AM, Zeljko Vrba wrote: > Searching the mail archives I found that using legend.position as in > p.ring.3 + opts(legend.position="top") > > is a known bug.  I tried doing > p.ring.3 + opts(legend.position=c(0.8, 0.2)) > > which works, but the legend background is trans

Re: [R] by-group processing

2009-05-08 Thread hadley wickham
On Wed, May 6, 2009 at 8:12 PM, jim holtman wrote: > This should do it: > >> do.call(rbind, lapply(split(x, x$ID), tail, 1)) >         ID Type N > 45900 45900    I 7 > 46550 46550    I 7 > 49270 49270    E 3 Or with plyr: library(plyr) ddply(x, "ID", tail, 1) plyr encapsulates the common split-
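The base R one-liner quoted above can be unpacked into its three steps: split the data frame by group, take the last row of each piece, and bind the pieces back together. The data below is invented for illustration; only the column names ID, Type and N appear in the thread.

```r
# Hypothetical data with the same column names as the quoted output.
x <- data.frame(
  ID   = c(1, 1, 2, 2, 2, 3),
  Type = c("I", "E", "I", "I", "E", "I"),
  N    = c(7, 3, 5, 6, 2, 9)
)

# Split: one data frame per ID value.
pieces <- split(x, x$ID)

# Apply: keep only the last row of each piece.
last_rows <- lapply(pieces, tail, 1)

# Combine: stack the one-row data frames back into a single data frame.
result <- do.call(rbind, last_rows)
result
```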

Re: [R] Houston, TX Users Group

2009-05-06 Thread hadley wickham
Hi Robert, I'm organising one - sign up to the mailing list, http://groups.google.com/group/houston-r. I'm hoping to organise our first meeting this summer. Hadley On Wed, May 6, 2009 at 10:15 AM, Robert Sanford wrote: > I'm looking for a Users Group in or near Houston, TX. > > Many thanks! >

Re: [R] Do you use R for data manipulation?

2009-05-06 Thread hadley wickham
> Take a look at plyr and reshape packages (http://had.co.nz/), I have a hunch > that they would have saved me a lot of headache had I found out about them > earlier :) As the author of these two packages, I'm admittedly biased, but I think R is unparalleled for data preparation, manipulation, and

Re: [R] re shape package - use one cast() instead of many

2009-05-05 Thread hadley wickham
On Tue, May 5, 2009 at 3:55 PM, jwg20 wrote: > > Thanks for your help! I wasn't sure what the margins variable did, but I'm > beginning to understand. I'm almost there, but with my data (and with ff_d) > I tried to margin over two variable names, however it only does one of them. > So with ff_d I

Re: [R] re shape package - use one cast() instead of many

2009-05-05 Thread hadley wickham
On Tue, May 5, 2009 at 3:03 PM, jwg20 wrote: > > I have a data set that I'm trying to melt and cast in a specific way using > the reshape package. (I'll use the ff_d dataset from reshape so I don't have > to post a toy data set here. ) > > Lets say I'm looking for the interaction of treatment with

Re: [R] quick square root axes

2009-05-05 Thread hadley wickham
> If you do write your own, the hardest part will be picking the nice tick > marks.  They should be approximately evenly spaced, but at nice round values > of the original variable:  that's hard to do in general.  R has the pretty() > function for the linear scale, and doesn't do too badly on log a

Re: [R] Overlaying graphs from different datasets with ggplot

2009-05-03 Thread hadley wickham
On Thu, Apr 30, 2009 at 2:03 PM, MUHC-Research wrote: > > Dear R-users, > > I recently began using the ggplot2 package and I am still in the process of > getting used to it. > > My goal would be to plot on the same grid a number of curves derived from > two distinct datasets. The first dataset (ca

Re: [R] Last month on the Revolutions blog

2009-05-02 Thread hadley wickham
Hi David, I think the revolution blog is fantastic and a great service to the R community. Thanks for all your hard work! Hadley On Fri, May 1, 2009 at 4:54 PM, David M Smith wrote: > I write about R every weekday at http://blog.revolution-computing.com > . In case you missed them, here are so

Re: [R] Reccomendation for graphics package

2009-05-01 Thread hadley wickham
On Fri, May 1, 2009 at 2:38 PM, Zeljko Vrba wrote: > On Fri, May 01, 2009 at 01:06:34PM -0500, hadley wickham wrote: >> >> It should be trivial with ggplot2 too, but it's hard to provide >> concrete advice without a concrete problem. >> > Elementary prob

Re: [R] MCO: Timing using model.matrix method

2009-05-01 Thread hadley wickham
> My issue is self-evident:  using this method resulted in a 30 fold > increase in time.  My question is why?  If I time the individual > components separately, nothing is unusual.  My hunch is the > "interaction" between the model.matrix and nsga2 methods. > > Any ideas on how to speed this proces

Re: [R] Reccomendation for graphics package

2009-05-01 Thread hadley wickham
> Is situation anything better with ggplot2?  It seems rather easy to get e.g. > line plots with error bars, provided that one feeds the data to some > modeling/regression function and passes the result over for plotting.. but > what > if I have generated my own error bar data?  This is almost tri

Re: [R] A beginner's question about ggplot

2009-05-01 Thread hadley wickham
On Fri, May 1, 2009 at 12:22 PM, MUHC-Research wrote: > > Dear R-users, > > I would have another question about the ggplot() function in the ggplot2 > package. > > All the examples I've read so far in the documentation make use of a single > neatly formatted data.frame. However, sometimes, one may

Re: [R] gridding values in a data frame

2009-04-30 Thread hadley wickham
It's hard to check without a reproducible example, but the following code should give you a 3d array of lat x long x time: library(reshape) df$lat <- round_any(df$LATITUDE, 5) df$long <- round_any(df$LONGITUDE, 5) df$value <- df$TIME cast(df, lat ~ long ~ time, mean) On Thu, Apr 30, 2009 at 10

Re: [R] Bumps chart in R

2009-04-27 Thread hadley wickham
] > library(ggplot2) > qplot(year,value, data=data,label=countries, geom=c("line","text"), > group=countries, col=countries) > > But I would like to have the text labels show only once - e.g. at 1990 > - and also control the size of the text. In my crude qplot, setting > size=2 e.g. changes not onl

Re: [R] Bumps chart in R

2009-04-26 Thread hadley wickham
In statistics, a bumps chart is more commonly called a parallel coordinates plot. Hadley On Sun, Apr 26, 2009 at 5:45 PM, Andreas Christoffersen wrote: > Hi there, > > I would like to make a 'bumps chart' like the ones described e.g. > here: http://junkcharts.typepad.com/junk_charts/bumps_chart/

Re: [R] eager to learn how to use "sapply", "lapply", ...

2009-04-26 Thread hadley wickham
Have a look at the plyr package and associated documentation - http://had.co.nz/plyr Hadley On Sun, Apr 26, 2009 at 12:42 PM, wrote: > After a year my R programming style is still very "C like". > I am still writing a lot of "for loops" and finding it difficult to recognize > where, in place o

Re: [R] 3 questions regarding matrix copy/shuffle/compares

2009-04-26 Thread hadley wickham
he original were changed; the sort of behavior that > might be seen  in a spreadsheet that had a copy "by reference". > > On Apr 26, 2009, at 11:28 AM, hadley wickham wrote: > >>>> I want to (1) create a deep copy of pop, >>> >>> I have already said *I*

Re: [R] 3 questions regarding matrix copy/shuffle/compares

2009-04-26 Thread hadley wickham
>> I want to (1) create a deep copy of pop, > > I have already said *I* do not know how to create a "deep copy" in R. Creating a deep copy is easy, because all copies are "deep" copies. You need to try very hard to create a reference in R. Hadley -- http://had.co.nz/
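A small sketch of what this means in practice for ordinary vectors and matrices: assigning an object to a new name and then modifying the new one leaves the original untouched. The matrix contents here are made up; only the name pop comes from the thread.

```r
# Hypothetical stand-in for the poster's matrix.
pop <- matrix(1:6, nrow = 2)

# Plain assignment; no special "deep copy" function is needed.
pop_copy <- pop

# Modifying the copy does not change the original.
pop_copy[1, 1] <- 99L

pop[1, 1]       # original value is unchanged
pop_copy[1, 1]  # only the copy was modified
```

Environments are the main exception: they are passed by reference, which is the "try very hard" case.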

Re: [R] omit empty cells in crosstab?

2009-04-24 Thread hadley wickham
On Fri, Apr 24, 2009 at 3:12 PM, sjaffe wrote: > > small example: > > a<-c(1.1, 2.1, 9.1) > b<-cut(a,0:10) > c<-data.frame(b,b) > d<-table(c) > dim(d) > ##result: c(10, 10) > > But only 9 of the 100 cells are non-zero. > If there were 10 columns, the table would have 10 dimensions each of length 10, so

Re: [R] omit empty cells in crosstab?

2009-04-24 Thread hadley wickham
Hi Steve, The general answer is yes, but the specific will depend on your problem. Could you provide a small reproducible example to illustrate your problem? Hadley On Fri, Apr 24, 2009 at 1:19 PM, sjaffe wrote: > > Perhaps this is a common question but I haven't been able to find the answer.

Re: [R] Generalized 2D list/array/whatever?

2009-04-24 Thread hadley wickham
On Fri, Apr 24, 2009 at 5:50 AM, Duncan Murdoch wrote: > Toby wrote: >> >> I'm trying to figure out how I can get a generalized 2D >> list/array/matrix/whatever >> working.  Seems I can't figure out how to make the variables the right >> type.  I >> always seem to get some sort of error... out of

Re: [R] conditional grouping of variables: ave or tapply or by or???

2009-04-23 Thread hadley wickham
On Thu, Apr 23, 2009 at 5:11 PM, ozan bakis wrote: > Dear R Users, > I have the following data frame: > > v1 <- c(rep(10,3),rep(11,2)) > v2 <- sample(5:10, 5, replace = T) > v3 <- c(0,1,2,0,2) > df <- data.frame(v1,v2,v3) >> df >  v1 v2 v3 > 1 10  9  0 > 2 10  5  1 > 3 10  6  2 > 4 11  7  0 > 5 11

Re: [R] bug when subtracting decimals?

2009-04-21 Thread hadley wickham
> "Have you read the posting guide and the FAQs? If you do not get a reply > within two days, you may want to look at both and think about reformulating > your query. Oh, and while you are at it, look through the archives, a lot of > questions have already been asked and answered before." As I say

[R] [R-pkgs] [ANN] ggplot2 version 0.8.3

2009-04-20 Thread hadley wickham
ggplot2 ggplot2 is a plotting system for R, based on the grammar of graphics, which tries to take the good parts of base and lattice graphics and avoid bad parts. It takes care of many of the fiddly details that make plotting a hassle (l

Re: [R] AICs from lmer different with summary and anova

2009-04-19 Thread hadley wickham
> Am I doing something wrong, here? If not, which are the real AIC and logLik > values for the different models? I don't think it's reasonable to expect that the log-likelihood computed by different functions should be comparable. Are the constant terms included or dropped? Hadley -- http://ha

Re: [R] Create histogram from data matrix

2009-04-18 Thread hadley wickham
On Fri, Apr 17, 2009 at 2:07 PM, Paul Warren Simonin wrote: > Thank you all for your advice. >  I have received some good tips, but it was suggested I write back with a > small simulated data set to better illustrate my needs. So, currently my > data frame looks something like: > > ID (date)  Temp

Re: [R] ColorRamp different from ColorRampPalette

2009-04-17 Thread hadley wickham
Look at the output of pal.cr((0:40)/40) Hadley On Fri, Apr 17, 2009 at 2:42 PM, Etienne B. Racine wrote: > > I try to use ColorRamp as ColorRampPalette (i.e. with the same gradient), but > it seems there is a nuance that I've missed. > > pal.crp<-colorRampPalette( c("blue", "white", "red"), space

Re: [R] numbers loop in R

2009-04-17 Thread hadley wickham
On Fri, Apr 17, 2009 at 12:19 PM, jim holtman wrote: > try this: > >> matrixx<-function(A){ > +     B=matrix(NaN,nrow=(A+1),ncol=4) > +     k <- 1 > +     for (i in 3:A){ > +         for (j in i:A) { > +             B[k,] <- c(NaN, i-2, i-1, j) > +             k <- k + 1 > +         } > +     } >

Re: [R] Create histogram from data matrix

2009-04-17 Thread hadley wickham
On Fri, Apr 17, 2009 at 9:59 AM, Paul Warren Simonin wrote: > Hello! >  Thanks for reading this request for assistance. I have a question regarding > creating a histogram-like figure from data that are not currently in the > correct format for the "hist" command. >  Specifically, my data have been

Re: [R] cast function in package reshape

2009-04-17 Thread hadley wickham
, namef) >      res <- c(res, get(namef)) >    } >    names(res) <- namesf >  } >  return(res) > } > > df <- data.frame(id = 1:50, x = sample(c(NA, 1), 50, T), y = sample(1:2, 50, > T), z = sample(letters[1:2], 50, T)) > >> freq1(df$x) > $freq_1 >

Re: [R] Extending a vector to length n

2009-04-16 Thread hadley wickham
> Levels: a >> > > R. Raubertas > Merck & Co > > >> -Original Message- >> From: r-help-boun...@r-project.org >> [mailto:r-help-boun...@r-project.org] On Behalf Of hadley wickham >> Sent: Wednesday, April 15, 2009 10:55 AM >> To: r-help &g

[R] [R-pkgs] [ANN] plyr version 0.1.7

2009-04-15 Thread hadley wickham
plyr is a set of tools for a common set of problems: you need to break down a big data structure into manageable pieces, operate on each piece and then put all the pieces back together. For example, you might want to: * fit the same model to subsets of a data frame * quickly calculate summary

[R] Extending a vector to length n

2009-04-15 Thread hadley wickham
In general, how can I increase a vector of length m (< n) to length n by padding it with n - m missing values, without losing attributes? The two approaches I've tried, using length<- and adding missings with c, do not work in general: > a <- as.Date("2008-01-01") > c(a, NA) [1] "2008-01-01" NA >
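One approach that may help, offered as a sketch rather than as the thread's conclusion: indexing past the end of a vector pads it with NA, and classes that define their own `[` method (Date and factor both do) keep their class. Attributes that the `[` method does not carry over would still be lost, so this is not a fully general answer. The names n, padded_date and padded_factor are made up for the example.

```r
n <- 3

# A Date vector of length 1, as in the question.
a <- as.Date("2008-01-01")

# Indexing beyond the end yields NA entries; the Date class is kept.
padded_date <- a[seq_len(n)]
padded_date

# The same idea applied to a factor keeps the class and the levels.
f <- factor("a")
padded_factor <- f[seq_len(n)]
padded_factor
```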

Re: [R] Concatenation, was Re: Physical Units in Calculations

2009-04-13 Thread hadley wickham
On Mon, Apr 13, 2009 at 4:15 AM, Peter Dalgaard wrote: > Stavros Macrakis wrote: > >> It would of course be nice if the existing difftime class could be fit >> into this, as it is currently pretty much a second-class citizen.  For >> example, c of two time differences is currently a numeric vector

Re: [R] Help with postscript (huge file size)

2009-04-12 Thread hadley wickham
> I'm generating some images in R to put into a document that I'm producing > using Latex. This document in Latex is following a predefined model, which > does not accept compilation with pdflatex, so I have to compile with latex > -> dvi -> pdf. Because of that, I have to generate the images in R

Re: [R] Display a very low p-value

2009-04-08 Thread hadley wickham
>  pnorm(37:39,lower.tail=FALSE) > [1] 5.725571e-300  0.00e+00  0.00e+00 > >  This is just a limitation of double precision floating-point arithmetic > ... > >  curve(pnorm(x,lower.tail=FALSE),from=30,to=40,log="y") > .Machine$double.xmin But note curve(pnorm(x,lower.tail=FALSE, log=T),fr
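To make the point about the log scale concrete, the sketch below compares the upper-tail probability with its logarithm for a few large arguments. The spelled-out argument name log.p is what the abbreviated log=T in the reply resolves to; the variable names are made up.

```r
x <- 37:40

# On the probability scale the upper tail underflows to zero for large x.
p <- pnorm(x, lower.tail = FALSE)

# On the log scale the same quantity stays finite and usable.
log_p <- pnorm(x, lower.tail = FALSE, log.p = TRUE)

data.frame(x = x, p = p, log_p = log_p)
```

Working with log_p directly (for plotting or for reporting a p-value as a power of ten) avoids the underflow entirely.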

Re: [R] newbie query: simple crosstabs

2009-04-07 Thread hadley wickham
On Tue, Apr 7, 2009 at 4:41 PM, Jorge Ivan Velez wrote: > Hi Eik, > You're absolutely right. My bad. > > Here is the correction of the code I sent: > > apply(mydata[,-1], 2, tapply, mydata[,1], function(x) sum(x)/length(x)) Or more simply: apply(mydata[,-1], 2, tapply, mydata[,1], mean) Hadley
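A tiny worked example of the simplified call, using invented data in which the first column is the grouping variable and the remaining columns are numeric. Only the name mydata comes from the thread.

```r
# Hypothetical data: first column is the group, the rest are measurements.
mydata <- data.frame(
  group = c("a", "a", "b", "b"),
  x     = c(1, 3, 5, 7),
  y     = c(2, 4, 6, 8)
)

# For each measurement column, compute the mean within each group.
# apply() walks over columns; tapply() does the per-group mean.
res <- apply(mydata[, -1], 2, tapply, mydata[, 1], mean)
res
```

The result is a matrix with one row per group and one column per measurement.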

Re: [R] Using as.formula() with the reshape package cast

2009-04-07 Thread hadley wickham
On Tue, Apr 7, 2009 at 8:44 AM, wrote: > > I am trying to use the "cast" function from the reshape package, where the > formula is not passed in directly, but as the result of the as.formula() > function. > > Using reshape v. 0.7.2 > > I am able to properly melt() by data with: > >> molten <- mel

Re: [R] change inter-line spacing in grid graphics - how to?

2009-04-07 Thread hadley wickham
Have a look at ?gpar - it will tell you about lineheight. Hadley On Tue, Apr 7, 2009 at 3:28 AM, Mark Heckmann wrote: > I am trying to change the inter-line spacing in grid.text(), but I just > don't find how to do it. > > pushViewport(viewport()) > grid.text("The inter-line spacing\n is too big

Re: [R] SUM,COUNT,AVG

2009-04-06 Thread hadley wickham
On Mon, Apr 6, 2009 at 5:31 PM, Jun Shen wrote: > This is a good example to compare different approaches. My understanding is > > aggregate() can apply one function to multiple columns > summarize() can apply multiple functions to one column > I am not sure if ddply() can actually apply multiple f

Re: [R] Collapse data matrix with extra info separated by commas

2009-04-06 Thread hadley wickham
On Mon, Apr 6, 2009 at 10:40 AM, baptiste auguie wrote: > Here's one attempt with plyr, hopefully Hadley will give you a better > solution ( I could not get cast() to do it either) > > test <- > data.frame(a=c("A","A","A","A","B","B","B"),b=c(1,1,2,2,1,1,1),c=sample(1:7)) > ddply(test,.(a,b),.fun=

Re: [R] SUM,COUNT,AVG

2009-04-06 Thread hadley wickham
On Mon, Apr 6, 2009 at 9:34 AM, Stavros Macrakis wrote: > There are various ways to do this in R. > > # sample data > dd <- data.frame(a=1:10,b=sample(3,10,replace=T),c=sample(3,10,replace=T)) > > Using the standard built-in functions, you can use: > > *** aggregate *** > > aggregate(dd,list(b=dd$

Re: [R] Best way to turn a list into a data.frame

2009-04-06 Thread hadley wickham
On Mon, Apr 6, 2009 at 8:49 AM, Daniel Brewer wrote: > Hello, > > What is the best way to turn a list into a data.frame? > > I have a list with something like: > $`3845` >  [1] "04010" "04012" "04360" > > $`1029` > [1] "04110" "04115" > > And I would like to get a data frame like the following: >

Re: [R] package: maps and spatstat question

2009-04-06 Thread hadley wickham
Hi Laura, You might find the map_data function from the ggplot2 package helpful: library(ggplot2) library(maps) head(map_data("state", "iowa")) It formats the output of the map command into a self-documenting data frame. Hadley On Mon, Apr 6, 2009 at 7:00 AM, Laura Chihara wrote: > > I would

Re: [R] data.frame, converting row data to columns

2009-04-04 Thread hadley wickham
On Sat, Apr 4, 2009 at 12:28 PM, jim holtman wrote: > Does this do what you want: > >> x <- read.table(textConnection("name         wrist nLevel            emot > + 1                    4094          3.34                    1   frustrated > + 2                    4094          3.94                

Re: [R] data.frame, converting row data to columns

2009-04-04 Thread hadley wickham
On Sat, Apr 4, 2009 at 12:09 PM, ds wrote: > > I have a data frame something like: >                      name         wrist > nLevel            emot > 1                    4094          3.34                    1 > frustrated > 2                    4094          3.94                    1 > frustra

Re: [R] data.frame to array?

2009-04-03 Thread hadley wickham
On Fri, Apr 3, 2009 at 1:45 PM, wrote: > I have a list of data.frames > >> str(bins) > > List of 19217 >  $ 100026:'data.frame': 1 obs. of  6 variables: >  ..$ Sku  : chr "100026" >  ..$ Bin  : chr "T149C" >  ..$ Count: int 108 >  ..$ X    : int 20 >  ..$ Y    : int 149 >  ..$ Z    : chr "3" >  $

Re: [R] plyr and table question

2009-04-03 Thread hadley wickham
On Fri, Apr 3, 2009 at 8:43 AM, baptiste auguie wrote: > That makes sense, so I can do something like, > > count <- function(x){ >        as.integer(unclass(table(x))) > } > > count(d$user_id) > > ddply(d, .(user_id), transform, count = count(user_id)) > >>  user_id  website time count >> 1      2

Re: [R] plyr and table question

2009-04-03 Thread hadley wickham
On Fri, Apr 3, 2009 at 4:43 AM, baptiste auguie wrote: > Dear all, > > I'm puzzled by the following example inspired by a recent question on > R-help, > > > cc <- textConnection("user_id  website          time > 20        google            0930 > 21        yahoo            0935 > 20        faceboo

Re: [R] Selecting all rows of factors which have at least one positive value?

2009-04-02 Thread hadley wickham
>   X1 X2 > 1  11  0 > 2  11  0 > 3  11  0 > 4  11  1 > 5  12  0 > 6  12  0 > 7  12  0 > 8  13  0 > 9  13  1 > 10 13  1 > > > and I want to select all rows pertaining to factor levels of X1 for > which exists at least one "1" for X2. To be clear, I want rows 1:4 > (since there exists at least one o
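The selection rule described in the question (keep every row whose X1 level has at least one 1 in X2) can be sketched in base R as follows. The reply itself is cut off in this listing, so this is not necessarily the approach that was suggested; `d`, `keep_levels` and `selected` are illustrative names:

```r
# Rebuild the example data from the question.
d <- data.frame(
  X1 = factor(c(11, 11, 11, 11, 12, 12, 12, 13, 13, 13)),
  X2 = c(0, 0, 0, 1, 0, 0, 0, 0, 1, 1)
)

# Levels of X1 that have at least one row with X2 equal to 1.
keep_levels <- unique(d$X1[d$X2 == 1])

# Keep every row belonging to one of those levels.
selected <- d[d$X1 %in% keep_levels, ]
print(selected)
```

Under that rule level 12 is dropped entirely, because none of its rows has X2 equal to 1.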

Re: [R] Deleting rows based on identity variable

2009-04-02 Thread hadley wickham
On Thu, Apr 2, 2009 at 3:37 PM, Rowe, Brian Lee Yung (Portfolio Analytics) wrote: > Is this what you want: >> d1[which(id != 4),] Or just d1[id != 4, ] Hadley
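The two forms give the same rows when `id` has no missing values, but they differ when it does. A small sketch with made-up data (the thread's `d1` is not shown, and `id` is referenced here as `d1$id` rather than as a free-standing variable):

```r
# Hypothetical data with one missing id.
d1 <- data.frame(id = c(1, 4, NA, 2), value = c("a", "b", "c", "d"))

# Logical indexing: the NA comparison produces an all-NA row in the result.
by_logical <- d1[d1$id != 4, ]

# which() keeps only the positions that are TRUE, so the NA row is dropped.
by_which <- d1[which(d1$id != 4), ]

print(by_logical)
print(by_which)

# With the missing id removed, both forms agree.
d2 <- d1[!is.na(d1$id), ]
same_without_na <- identical(d2[d2$id != 4, ], d2[which(d2$id != 4), ])
```

So the shorter form is fine whenever the identity variable is complete; `which()` only matters if it can contain NA.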

Re: [R] Public R servers?

2009-04-01 Thread hadley wickham
> Earlier I posted a question about memory usage, and the community's input was > very helpful.  However, I'm now extending my dataset (which I use when > running a regression using lm).  As a result, I am continuing to run into > problems with memory usage, and I believe I need to shift to impl

Re: [R] Calculating First Occurance by a factor

2009-04-01 Thread hadley wickham
On Wed, Apr 1, 2009 at 11:00 AM, hadley wickham wrote: >> I tried messing with the line df$FixTime[which.min(df$FixInx)] changing it >> to df[which.min(df$FixInx)] or adding new lines with the additional columns >> that I want to include, but nothing seemed to work. I'

Re: [R] Calculating First Occurance by a factor

2009-04-01 Thread hadley wickham
> I tried messing with the line df$FixTime[which.min(df$FixInx)] changing it > to df[which.min(df$FixInx)] or adding new lines with the additional columns > that I want to include, but nothing seemed to work. I'll admit I only have a > mild understanding of what is going on with the function .fun.

Re: [R] ggplot: order of numeric factor levels?

2009-03-31 Thread hadley wickham
On Tue, Mar 31, 2009 at 5:01 PM, Marianne Promberger wrote: > Hi, > > I'm having problems with qplot and the order of numeric factor levels. > > Factors with numeric levels show up in the order in which they appear > in the data, not in the order of the levels (as far as I understand > factors!) >

Re: [R] Reshape: 'melt' numerous objects

2009-03-31 Thread hadley wickham
On Tue, Mar 31, 2009 at 11:12 AM, Steve Murray wrote: > > Dear R Users, > > I'm trying to use the reshape package to 'melt' my gridded data into column > format. I've done this before on individual files, but this time I'm trying > to do it on a directory of files (with variable file names) - th

Re: [R] Using apply to get group means

2009-03-31 Thread hadley wickham
On Tue, Mar 31, 2009 at 11:31 AM, baptiste auguie wrote: > Not exactly the output you asked for, but perhaps you can consider, > > library(doBy) >> summaryBy(x3~x2+x1,data=x,FUN=mean) >> >>  x2 x1 x3.mean >> 1  1  A     1.5 >> 2  1  B     2.0 >> 3  1  C     3.5 >> 4  2  A     4.0 >> 5  2  B     5.

[R] Bug in col2rgb?

2009-03-31 Thread hadley wickham
> col2rgb("#0079", TRUE)
      [,1]
red      0
green    0
blue     0
alpha  121
> col2rgb("#0080", TRUE)
      [,1]
red    255
green  255
blue   255
alpha    0
> col2rgb("#0081", TRUE)
      [,1]
red      0
green    0
blue     0
alpha  129

Any ideas? Thanks, Hadley
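The alpha values reported for the first and third calls match the decimal value of the last two hex digits of each colour string. A quick base R check of that conversion, which suggests 128 is what one would expect for the middle case (`hex_pairs` and `alpha_expected` are illustrative names):

```r
# Trailing two hex digits of the three colour strings in the report.
hex_pairs <- c("79", "80", "81")

# Convert each pair from base 16 to a decimal integer.
alpha_expected <- strtoi(hex_pairs, base = 16L)
print(alpha_expected)
```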

Re: [R] Calculating First Occurance by a factor

2009-03-30 Thread hadley wickham
On Mon, Mar 30, 2009 at 2:58 PM, Mike Lawrence wrote: > I discovered Hadley Wickham's "plyr" package last week and have found > it very useful in circumstances like this: > > library(plyr) > > firstfixtime = ddply( >       .data = data >       , .variables = c('Sub','Tr','IA') >       , .fun <- fu

Re: [R] how to input multiple .txt files

2009-03-30 Thread hadley wickham
On Mon, Mar 30, 2009 at 10:33 AM, Mike Lawrence wrote: > To repent for my sins, I'll also suggest that Hadley Wickham's "plyr" > package (http://had.co.nz/plyr/) is also useful/parsimonious in this > context: > > a <- ldply(cust1_files,read.table) You might also want to do names(cust1_files) <-
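A base R sketch of the same idea, for readers without plyr: read every file, then bind the pieces together with a column recording which file each row came from. The right-hand side of the `names(cust1_files) <-` assignment is cut off above, so using `basename()` here is an assumption, as are the demo directory and file names. As far as I understand, naming the input is what lets `ldply` add an identifier column to its result.

```r
# Create two small demo files in a temporary directory (hypothetical data).
tmp_dir <- file.path(tempdir(), "cust_demo")
dir.create(tmp_dir, showWarnings = FALSE)
write.table(data.frame(x = 1:2, y = c("a", "b")),
            file.path(tmp_dir, "cust_a.txt"), row.names = FALSE)
write.table(data.frame(x = 3:4, y = c("c", "d")),
            file.path(tmp_dir, "cust_b.txt"), row.names = FALSE)

# Collect the file paths and name each one after its file.
cust1_files <- list.files(tmp_dir, pattern = "\\.txt$", full.names = TRUE)
names(cust1_files) <- basename(cust1_files)

# Read each file; lapply() keeps the names on the resulting list.
pieces <- lapply(cust1_files, read.table, header = TRUE)

# Add the file name as a column, then stack all pieces into one data frame.
combined <- do.call(rbind, lapply(names(pieces), function(n) {
  cbind(file = n, pieces[[n]])
}))
print(combined)
```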
