Re: [R] Select values at random by id value

2009-07-02 Thread hadley wickham
On Thu, Jul 2, 2009 at 8:15 AM, James Martinjust.strut...@gmail.com wrote: Hadley, Sunil, and list, This is not quite doing what I wanted it to do (as far as I can tell). I perhaps did not explain it thoroughly.  It seems to be sampling one value for each day leaving ~200 observations. I need

Re: [R] Learning S3

2009-07-02 Thread Hadley Wickham
On Thu, Jun 18, 2009 at 12:08 PM, Dirk Eddelbuettele...@debian.org wrote: On 18 June 2009 at 09:36, Bert Gunter wrote: | -- or Chapter 4 in S PROGRAMMING? (you'll need to determine if it's reader | friendly) +1 It helped me a lot too back in the day.  But I am wondering if there are good

Re: [R] Passing expression as argument to do.call

2009-07-02 Thread hadley wickham
On Thu, Jul 2, 2009 at 3:34 PM, Sebastien Bihorelsebastien.biho...@cognigencorp.com wrote: Dear R-users, I would like to know how expressions could be passed as arguments to do.call functions. As illustrated in the short example below, concatenating lists objects and an expression creates an

Re: [R] Select values at random by id value

2009-07-01 Thread hadley wickham
On Wed, Jul 1, 2009 at 2:10 PM, Sunil Suchindran <sunilsuchind...@gmail.com> wrote: #Highlight the text below (without the header) # read the data in from clipboard df <- do.call(data.frame, scan("clipboard", what=list(id=0, date="", loctype=0, haptype=0))) # split the data by date, sample 1

Re: [R] How do I change which R Graphics Device is active?

2009-06-30 Thread hadley wickham
On Tue, Jun 30, 2009 at 2:12 PM, Barry Rowlingsonb.rowling...@lancaster.ac.uk wrote: On Tue, Jun 30, 2009 at 8:05 PM, Mark Knechtmarkkne...@gmail.com wrote: You could wrap it in a function of your own making, right? AddNewDev = function() {dev.new();AddNewDev=dev.cur()} histPlot=AddNewDev()

Re: [R] ggplot2 x axis question

2009-06-29 Thread hadley wickham
graphing package. That code is giving me the following error: qplot(reorder(model,delta),delta,data=growthm.bic) Error in UseMethod("reorder") : no applicable method for "reorder" Cheers, Chris On 6/28/09 8:21 PM, hadley wickham wrote: Hi Chris, Try this: qplot(reorder(model, delta), delta

Re: [R] simple loop

2009-06-28 Thread hadley wickham
Also consider ddply in the plyr package (although that's overkill if you're only having two loops) Maybe, but it sure is much simpler: library(plyr) ddply(data, c("industry", "year"), summarise, avg = mean(X1)) Hadley -- http://had.co.nz/
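The ddply call above computes one mean per industry/year group. For readers without plyr installed, a rough base-R sketch of the same grouped summary follows. The data frame is made up for illustration; only the column names (industry, year, X1) come from the post, and aggregate() is a substitute for ddply, not the method the post recommends.

```r
# Hypothetical data: column names mirror the post, values are invented.
data <- data.frame(
  industry = c("a", "a", "a", "b", "b"),
  year     = c(2007, 2007, 2008, 2007, 2007),
  X1       = c(1, 3, 5, 2, 4)
)

# One row per industry/year combination, holding the group mean of X1.
avg <- aggregate(X1 ~ industry + year, data = data, FUN = mean)

# Rename the summary column to match the 'avg' name used in the ddply call.
names(avg)[names(avg) == "X1"] <- "avg"
avg
```

The result has one row for each industry/year pair that occurs in the data, which is the same shape ddply with summarise would return.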

Re: [R] ggplot2 x axis question

2009-06-28 Thread hadley wickham
Hi Chris, Try this: qplot(reorder(model, delta), delta, data = growthm.bic) Hadley On Sun, Jun 28, 2009 at 9:53 AM, Christopher Desjardinscddesjard...@gmail.com wrote: Hi, I have 45 models that I have named: 1, 2, 3, ... , 45 and I am trying to plot them in order of ascending BIC values. I

Re: [R] a plot of stacked boxes

2009-06-26 Thread hadley wickham
On Fri, Jun 26, 2009 at 10:27 PM, Osman Al-Radiosman.al.r...@gmail.com wrote: Dear Richard and David, Thanks for this reference. I looked into vcd and mosaic plot, it is a nice plot for investigating associations between two or more variables. However, I just need to plot the frequency of a

Re: [R] Using by() and stacking back sub-data frames to one data frame

2009-06-25 Thread hadley wickham
Have a look at ddply from the plyr package, http://had.co.nz/plyr. It's made for exactly this type of operation. Hadley On Wed, Jun 24, 2009 at 10:34 PM, Stephan Lindnerlindn...@umich.edu wrote: Dear all, I have a code where I subset a data frame to match entries within levels of an factor

Re: [R] by question

2009-06-24 Thread hadley wickham
You might also want to look at the plyr package, http://had.co.nz/plyr. In particular, ddply + transform makes these tasks very easy. library(plyr) ddply(mtcars, "cyl", transform, pos = seq_along(cyl), mpg_avg = mean(mpg)) Hadley On Wed, Jun 24, 2009 at 11:48 AM, David
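As a point of comparison, the same per-group columns can be sketched in base R with ave(). This is an assumed equivalent, not code from the thread. Unlike ddply, it keeps the rows in their original order rather than sorting them by group.

```r
# Add a within-group position and a group mean to every row of mtcars.
out <- transform(
  mtcars,
  pos     = ave(seq_along(cyl), cyl, FUN = seq_along),  # 1, 2, ... within each cyl
  mpg_avg = ave(mpg, cyl)                               # ave() defaults to mean
)
head(out[, c("cyl", "mpg", "pos", "mpg_avg")])
```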

[R] [R-pkgs] plyr 0.1.9

2009-06-23 Thread Hadley Wickham
plyr is a set of tools for a common set of problems: you need to break down a big data structure into manageable pieces, operate on each piece and then put all the pieces back together. For example, you might want to: * fit the same model to subsets of a data frame * quickly calculate

Re: [R] Apply as.factor (or as.numeric etc) to multiple columns

2009-06-23 Thread hadley wickham
Hi Mark, Have a look at colwise (and numcolwise and catcolwise) in the plyr package. Hadley On Tue, Jun 23, 2009 at 4:23 PM, Mark Namtb...@gmail.com wrote: Hi R-helpers, I have a dataframe with 60 columns and I would like to convert several columns to factor, others to numeric, and yet

Re: [R] Roxygen vs Sweave for S4 documentation

2009-06-21 Thread hadley wickham
I have been using R for a while.  Recently, I have begun converting my package into S4 classes.  I was previously using Rdoc for documentation. Now, I am looking to use the best tool for S4 documentation.  It seems that the best choices for me are Roxygen and Sweave (I am fine with tex). Are

[R] Learning S3

2009-06-18 Thread Hadley Wickham
Hi all, Do you know of any good resources for learning how S3 works? I've somehow become familiar with it by reading many small pieces, but now that I'm teaching it to students I'm wondering if there are any good resources that describe it completely, especially in a reader-friendly way. So

Re: [R] Learning S3

2009-06-18 Thread Hadley Wickham
To: Hadley Wickham Cc: r-help Subject: Re: [R] Learning S3 There is a section on Object Orientation in MASS (I have 2nd ed). On Thu, Jun 18, 2009 at 12:06 PM, Hadley Wickhamhad...@rice.edu wrote: Hi all, Do you know of any good resources for learning how S3 works?  I've some how become familiar

Re: [R] Dataset suggestion sought

2009-06-18 Thread hadley wickham
In revising my book Regression Modeling Strategies for a second edition, I am seeking a dataset for exemplifying multiple regression using least squares.  Ideally the dataset would have 5-40 variables and 40-1 independent observations, and would generate significant interest for a wide

[R] [OT] VBA to save excel as csv

2009-06-15 Thread Hadley Wickham
Hi all, This is a little off-topic, but it is on the general topic of getting data into R. I'm looking for an Excel macro / VBA script that will export all spreadsheets in a directory (with one file per tab) into csv. Does anyone have anything like this? Thanks, Hadley -- http://had.co.nz/

[R] Programmatically copying a graphic to the clipboard

2009-06-12 Thread Hadley Wickham
Hi all, Is there a cross-platform way to do this? On the Mac, I can do this by saving an eps file, and then using pbcopy. Is it possible on other platforms? Hadley -- http://had.co.nz/ __ R-help@r-project.org mailing list

Re: [R] Looking for easy way to normalize data by groups

2009-06-08 Thread hadley wickham
On Mon, Jun 8, 2009 at 10:29 AM, Herbert Jägleherbert.jae...@uni-tuebingen.de wrote: Hi, i do have a dataframe representing data from a repeated experiment. PID is a subject identifier, Time are timepoints in an experiment which was repeated twice. For each subject and all three timepoints

Re: [R] how to substitute missing values (NAs) by the group means

2009-06-08 Thread hadley wickham
On Mon, Jun 8, 2009 at 8:56 PM, Mao Jianfengjianfeng@gmail.com wrote: Dear Ruser's I ask for helps on how to substitute missing values (NAs) by mean of the group it is belonging to. my dummy dataframe is: df       group traits 1  BSPy01-10     NA 2  BSPy01-10    7.3 3  BSPy01-10    

Re: [R] A very frustrating read.table error message

2009-06-06 Thread hadley wickham
On Sat, Jun 6, 2009 at 5:02 PM, Adam D. I. Kramera...@ilovebacon.org wrote: Dear Colleagues,        Occasionally I deal with computer-generated (i.e., websurvey) data files that haven't quite worked correctly. When I try to read the data into R, I get something like this: Error in

Re: [R] Minor tick marks for date/time ggplot2 (this is better, but not exactly what I want)

2009-06-04 Thread hadley wickham
On Mon, Jun 1, 2009 at 2:18 PM, stephen sefick ssef...@gmail.com wrote: library(ggplot2) melt.updn <- (structure(list(date = structure(c(11808, 11869, 11961, 11992, 12084, 12173, 12265, 12418, 12600, 12631, 12753, 12996, 13057, 13149, 11808, 11869, 11961, 11992, 12084, 12173, 12265, 12418,

Re: [R] OT: Inference for R - Interview

2009-06-04 Thread hadley wickham
Is it really necessary to further advertise this company which already spams R-help subscribers? Hadley On Thu, Jun 4, 2009 at 10:41 PM, Ajay ohri ohri2...@gmail.com wrote: Dear All, Slightly off -non technical topic ( but hey it is Friday) Following last week's interview with REvolution

Re: [R] ggplot2 and Date class

2009-06-01 Thread hadley wickham
You might have an out-of-date version of the plyr package - try install.packages("plyr") Hadley On Mon, Jun 1, 2009 at 10:20 AM, Matt Frost mwfr...@gmail.com wrote: I'm trying to plot a time series in ggplot, but a date column in my data frame is causing errors. Rather than provide my own data,

Re: [R] ggplot2: annotating plot with mathematical formulae

2009-05-16 Thread hadley wickham
Hi Paul, Unfortunately that's not something that's currently possible with ggplot2, but I am thinking about how to make it possible. Hadley On Sat, May 16, 2009 at 7:48 AM, Paul Emberson em...@calidasoft.co.uk wrote: Hi Stephen, The problem is that the label on the graph doesn't get rendered

Re: [R] assign unique size of point in xyplot

2009-05-15 Thread hadley wickham
On Thu, May 14, 2009 at 2:14 PM, Garritt Page page2...@gmail.com wrote: Hello, I am using xyplot to try and create a conditional plot.  Below is a toy example of the type of data I am working with slevel <- rep(rep(c(0.5,0.9), each=2, times=2), times=2) tlevel <- rep(rep(c(0.5,0.9), each=4),

Re: [R] Function to read a string as the variables as opposed to taking the string name as the variable

2009-05-14 Thread hadley wickham
On Thu, May 14, 2009 at 12:16 PM, Lori Simpson lori.simp...@dc-energy.com wrote: I am writing a custom function that uses an R-function from the reshape package: cast.  However, my question could be applicable to any R function. Normally one writes the arguments directly into a function,

Re: [R] memory usage grows too fast

2009-05-14 Thread hadley wickham
On Thu, May 14, 2009 at 6:21 PM, Ping-Hsun Hsieh hsi...@ohsu.edu wrote: Hi All, I have a 1000x100 matrix. The calculation I would like to do is actually very simple: for each row, calculate the frequency of a given pattern. For example, a toy dataset is as follows. Col1    Col2    

Re: [R] Help with reshape/reShape and indexing

2009-05-13 Thread hadley wickham
This does it more or less your way: ds <- split(df, df$Name) ds <- lapply(ds, function(x){x$Index <- seq_along(x[,1]); x}) df2 <- unsplit(ds, df$Name) tapply(df2$X1, df2[,c("Name", "Index")], function(x) x) although there may exist much easier ways ... Here's one way with the plyr and reshape

Re: [R] ggplot2: recommended workaround for broken legend.position=top

2009-05-10 Thread hadley wickham
On Sun, May 10, 2009 at 10:32 AM, Zeljko Vrba zv...@ifi.uio.no wrote: Searching the mail archives I found that using legend.position as in p.ring.3 + opts(legend.position="top") is a known bug.  I tried doing p.ring.3 + opts(legend.position=c(0.8, 0.2)) which works, but the legend background

Re: [R] by-group processing

2009-05-08 Thread hadley wickham
On Wed, May 6, 2009 at 8:12 PM, jim holtman jholt...@gmail.com wrote: This should do it: do.call(rbind, lapply(split(x, x$ID), tail, 1))         ID Type N 45900 45900    I 7 46550 46550    I 7 49270 49270    E 3 Or with plyr: library(plyr) ddply(x, "id", tail, 1) plyr encapsulates the
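The split/lapply/rbind idiom quoted above can be tried on a small data frame. The sketch below uses invented data (only the column names ID, Type and N come from the post) and adds a duplicated()-based variant as an assumed alternative that relies on the rows already being in the desired order within each ID.

```r
# Hypothetical data with the column names shown in the post.
x <- data.frame(
  ID   = c(1, 1, 2, 2, 2),
  Type = c("I", "E", "I", "I", "E"),
  N    = 1:5
)

# Split by ID, keep the last row of each piece, and stack the pieces again.
last <- do.call(rbind, lapply(split(x, x$ID), tail, 1))

# Alternative: keep rows whose ID does not occur again further down.
last2 <- x[!duplicated(x$ID, fromLast = TRUE), ]

last
```

Both versions select the same rows here; they differ only in the row names of the result.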

Re: [R] Do you use R for data manipulation?

2009-05-06 Thread hadley wickham
Take a look at plyr and reshape packages (http://had.co.nz/), I have a hunch that they would have saved me a lot of headache had I found out about them earlier :) As the author of these two packages, I'm admittedly biased, but I think R is unparalleled for data preparation, manipulation, and

Re: [R] Houston, TX Users Group

2009-05-06 Thread hadley wickham
Hi Robert, I'm organising one - sign up to the mailing list, http://groups.google.com/group/houston-r. I'm hoping to organise our first meeting this summer. Hadley On Wed, May 6, 2009 at 10:15 AM, Robert Sanford wob...@gmail.com wrote: I'm looking for a Users Group in or near Houston, TX.

Re: [R] quick square root axes

2009-05-05 Thread hadley wickham
If you do write your own, the hardest part will be picking the nice tick marks.  They should be approximately evenly spaced, but at nice round values of the original variable:  that's hard to do in general.  R has the pretty() function for the linear scale, and doesn't do too badly on log

Re: [R] Overlaying graphs from different datasets with ggplot

2009-05-03 Thread hadley wickham
On Thu, Apr 30, 2009 at 2:03 PM, MUHC-Research villa...@dms.umontreal.ca wrote: Dear R-users, I recently began using the ggplot2 package and I am still in the process of getting used to it. My goal would be to plot on the same grid a number of curves derived from two distinct datasets. The

Re: [R] Last month on the Revolutions blog

2009-05-02 Thread hadley wickham
Hi David, I think the revolution blog is fantastic and a great service to the R community. Thanks for all your hard work! Hadley On Fri, May 1, 2009 at 4:54 PM, David M Smith da...@revolution-computing.com wrote: I write about R every weekday at http://blog.revolution-computing.com . In case

Re: [R] A beginner's question about ggplot

2009-05-01 Thread hadley wickham
On Fri, May 1, 2009 at 12:22 PM, MUHC-Research villa...@dms.umontreal.ca wrote: Dear R-users, I would have another question about the ggplot() function in the ggplot2 package. All the examples I've read so far in the documentation make use of a single neatly formatted data.frame. However,

Re: [R] Reccomendation for graphics package

2009-05-01 Thread hadley wickham
Is situation anything better with ggplot2?  It seems rather easy to get e.g. line plots with error bars, provided that one feeds the data to some modeling/regression function and passes the result over for plotting.. but what if I have generated my own error bar data?  This is almost trivial

Re: [R] MCO: Timing using model.matrix method

2009-05-01 Thread hadley wickham
My issue is self-evident:  using this method resulted in a 30 fold increase in time.  My question is why?  If I time the individual components separately, nothing is unusual.  My hunch is the interaction between the model.matrix and nsga2 methods. Any ideas on how to speed this process up,

Re: [R] Reccomendation for graphics package

2009-05-01 Thread hadley wickham
On Fri, May 1, 2009 at 2:38 PM, Zeljko Vrba zv...@ifi.uio.no wrote: On Fri, May 01, 2009 at 01:06:34PM -0500, hadley wickham wrote: It should be trivial with ggplot2 too, but it's hard to provide concrete advice without a concrete problem. Elementary problem: qplot(wg, v.realtime, data

Re: [R] gridding values in a data frame

2009-04-30 Thread hadley wickham
It's hard to check without a reproducible example, but the following code should give you a 3d array of lat x long x time: library(reshape) df$lat <- round_any(df$LATITUDE, 5) df$long <- round_any(df$LONGITUDE, 5) df$value <- df$TIME cast(df, lat ~ long ~ time, mean) On Thu, Apr 30, 2009 at

Re: [R] Bumps chart in R

2009-04-27 Thread hadley wickham
library(ggplot2) qplot(year,value, data=data,label=countries, geom=c("line","text"), group=countries, col=countries) But I would like to have the text labels show only once - e.g. at 1990 - and also control the size of the text. In my crude qplot, setting size=2 e.g. changes not only the

Re: [R] 3 questions regarding matrix copy/shuffle/compares

2009-04-26 Thread hadley wickham
I want to (1) create a deep copy of pop, I have already said *I* do not know how to create a deep copy in R. Creating a deep copy is easy, because all copies are deep copies. You need to try very hard to create a reference in R. Hadley -- http://had.co.nz/
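A small sketch of the point about copies: assigning a matrix to a new name and modifying it leaves the original untouched. The object name pop comes from the thread, but its contents here are invented. The environment part is added only as a contrast, since environments are one of the few R objects with reference behaviour.

```r
# Ordinary objects: assignment behaves like a deep copy.
pop <- matrix(1:6, nrow = 2)
pop_copy <- pop
pop_copy[1, 1] <- 99L
pop[1, 1]        # the original is unchanged

# Environments are the exception: both names refer to the same object.
e1 <- new.env()
e1$value <- 1
e2 <- e1
e2$value <- 2
e1$value         # the change made through e2 is visible through e1
```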

Re: [R] 3 questions regarding matrix copy/shuffle/compares

2009-04-26 Thread hadley wickham
in the original were changed; the sort of behavior that might be seen  in a spreadsheet that had a copy by reference. On Apr 26, 2009, at 11:28 AM, hadley wickham wrote: I want to (1) create a deep copy of pop, I have already said *I* do not know how to create a deep copy in R. Creating a deep

Re: [R] eager to learn how to use sapply, lapply, ...

2009-04-26 Thread hadley wickham
Have a look at the plyr package and associated documentation - http://had.co.nz/plyr Hadley On Sun, Apr 26, 2009 at 12:42 PM, mau...@alice.it wrote: After a year my R programming style is still very C like. I am still writing a lot of for loops and finding it difficult to recognize where,

Re: [R] Bumps chart in R

2009-04-26 Thread hadley wickham
In statistics, a bumps chart is more commonly called a parallel coordinates plot. Hadley On Sun, Apr 26, 2009 at 5:45 PM, Andreas Christoffersen achristoffer...@gmail.com wrote: Hi there, I would like to make a 'bumps chart' like the ones described e.g. here:

Re: [R] Generalized 2D list/array/whatever?

2009-04-24 Thread hadley wickham
On Fri, Apr 24, 2009 at 5:50 AM, Duncan Murdoch murd...@stats.uwo.ca wrote: Toby wrote: I'm trying to figure out how I can get a generalized 2D list/array/matrix/whatever working.  Seems I can't figure out how to make the variables the right type.  I always seem to get some sort of error...

Re: [R] omit empty cells in crosstab?

2009-04-24 Thread hadley wickham
Hi Steve, The general answer is yes, but the specific will depend on your problem. Could you provide a small reproducible example to illustrate your problem? Hadley On Fri, Apr 24, 2009 at 1:19 PM, sjaffe sja...@riskspan.com wrote: Perhaps this is a common question but I haven't been able to

Re: [R] omit empty cells in crosstab?

2009-04-24 Thread hadley wickham
On Fri, Apr 24, 2009 at 3:12 PM, sjaffe sja...@riskspan.com wrote: small example: a <- c(1.1, 2.1, 9.1) b <- cut(a,0:10) c <- data.frame(b,b) d <- table(c) dim(d) ##result: c(10, 10) But only 9 of the 100 cells are non-zero. If there were 10 columns, the table have 10 dimensions each of length 10,

Re: [R] conditional grouping of variables: ave or tapply or by or???

2009-04-23 Thread hadley wickham
On Thu, Apr 23, 2009 at 5:11 PM, ozan bakis ozanba...@gmail.com wrote: Dear R Users, I have the following data frame: v1 <- c(rep(10,3),rep(11,2)) v2 <- sample(5:10, 5, replace = T) v3 <- c(0,1,2,0,2) df <- data.frame(v1,v2,v3) df  v1 v2 v3 1 10  9  0 2 10  5  1 3 10  6  2 4 11  7  0 5

Re: [R] bug when subtracting decimals?

2009-04-21 Thread hadley wickham
Have you read the posting guide and the FAQs? If you do not get a reply within two days, you may want to look at both and think about reformulating your query. Oh, and while you are at it, look through the archives, a lot of questions have already been asked and answered before. As I say

[R] [R-pkgs] [ANN] ggplot2 version 0.8.3

2009-04-20 Thread hadley wickham
ggplot2 ggplot2 is a plotting system for R, based on the grammar of graphics, which tries to take the good parts of base and lattice graphics and avoid bad parts. It takes care of many of the fiddly details that make plotting a hassle

Re: [R] AICs from lmer different with summary and anova

2009-04-19 Thread hadley wickham
Am I doing something wrong, here? If not, which are the real AIC and logLik values for the different models? I don't think it's reasonable to expect that the log-likelihood computed by different functions should be comparable. Are the constant terms included or dropped? Hadley --

Re: [R] Create histogram from data matrix

2009-04-18 Thread hadley wickham
On Fri, Apr 17, 2009 at 2:07 PM, Paul Warren Simonin paul.simo...@uvm.edu wrote: Thank you all for your advice.  I have received some good tips, but it was suggested I write back with a small simulated data set to better illustrate my needs. So, currently my data frame looks something like:

Re: [R] cast function in package reshape

2009-04-17 Thread hadley wickham
) {    sum(x[!is.na(x)] == 2) } environment: 0x03d6c930 $freq_1 function (x) {    sum(x[!is.na(x)] == 1) } environment: 0x03d6c930 I would like to use this list of functions with cast function (in package reshape by Hadley Wickham) : cast(melt(df, id = c(id, z), measure = c(x, y

Re: [R] Create histogram from data matrix

2009-04-17 Thread hadley wickham
On Fri, Apr 17, 2009 at 9:59 AM, Paul Warren Simonin paul.simo...@uvm.edu wrote: Hello!  Thanks for reading this request for assistance. I have a question regarding creating a histogram-like figure from data that are not currently in the correct format for the hist command.  Specifically, my

Re: [R] numbers loop in R

2009-04-17 Thread hadley wickham
On Fri, Apr 17, 2009 at 12:19 PM, jim holtman jholt...@gmail.com wrote: try this: matrixx <- function(A){ +     B=matrix(NaN,nrow=(A+1),ncol=4) +     k <- 1 +     for (i in 3:A){ +         for (j in i:A) { +             B[k,] <- c(NaN, i-2, i-1, j) +             k <- k + 1 +         } +     }

Re: [R] ColorRamp different from ColorRampPalette

2009-04-17 Thread hadley wickham
Look at the output of pal.cr((0:40)/40) Hadley On Fri, Apr 17, 2009 at 2:42 PM, Etienne B. Racine etienn...@gmail.com wrote: I try to use ColorRamp as ColorRampPalette (i.e. with the same gradient), but it seems there is a nuance that I've missed. pal.crp <- colorRampPalette( c("blue", "white",
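The hint above can be followed with a short sketch. The colour vector in the original post is cut off after "white", so the third colour ("red") is an assumption made purely for illustration, as is the definition of pal.cr. The example shows that colorRamp() returns a function from [0, 1] to a matrix of RGB values, whereas colorRampPalette() returns a function from a count n to a character vector of colours.

```r
# Assumed palette: only "blue" and "white" appear in the post; "red" is invented.
cols_in <- c("blue", "white", "red")

pal.cr  <- colorRamp(cols_in)         # function of values in [0, 1]
pal.crp <- colorRampPalette(cols_in)  # function of a number of colours

m <- pal.cr((0:40) / 40)              # numeric matrix: one row per value, columns R, G, B
hex_from_ramp <- rgb(m, maxColorValue = 255)  # convert rows to colour strings
hex_from_pal  <- pal.crp(41)                  # already colour strings

dim(m)
head(hex_from_ramp)
head(hex_from_pal)
```

To use the colorRamp() output where colours are expected, the RGB matrix has to be converted with rgb() first, as in the sketch.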

Re: [R] Extending a vector to length n

2009-04-16 Thread hadley wickham
-help-boun...@r-project.org] On Behalf Of hadley wickham Sent: Wednesday, April 15, 2009 10:55 AM To: r-help Subject: [R] Extending a vector to length n In general, how can I increase a vector of length m (< n) to length n by padding it with n - m missing values, without losing attributes

[R] Extending a vector to length n

2009-04-15 Thread hadley wickham
In general, how can I increase a vector of length m (< n) to length n by padding it with n - m missing values, without losing attributes? The two approaches I've tried, using length<- and adding missings with c, do not work in general: a <- as.Date("2008-01-01") c(a, NA) [1] "2008-01-01" NA length(a)

[R] [R-pkgs] [ANN] plyr version 0.1.7

2009-04-15 Thread hadley wickham
plyr is a set of tools for a common set of problems: you need to break down a big data structure into manageable pieces, operate on each piece and then put all the pieces back together. For example, you might want to: * fit the same model to subsets of a data frame * quickly calculate

Re: [R] Concatenation, was Re: Physical Units in Calculations

2009-04-13 Thread hadley wickham
On Mon, Apr 13, 2009 at 4:15 AM, Peter Dalgaard p.dalga...@biostat.ku.dk wrote: Stavros Macrakis wrote: It would of course be nice if the existing difftime class could be fit into this, as it is currently pretty much a second-class citizen.  For example, c of two time differences is currently

Re: [R] Help with postscript (huge file size)

2009-04-12 Thread hadley wickham
I'm generating some images in R to put into a document that I'm producing using Latex. This document in Latex is following a predefined model, which does not accept compilation with pdflatex, so I have to compile with latex - dvi - pdf. Because of that, I have to generate the images in R with

Re: [R] Display a very low p-value

2009-04-08 Thread hadley wickham
pnorm(37:39,lower.tail=FALSE) [1] 5.725571e-300  0.000000e+00  0.000000e+00  This is just a limitation of double precision floating-point arithmetic ...  curve(pnorm(x,lower.tail=FALSE),from=30,to=40,log="y") .Machine$double.xmin But note curve(pnorm(x,lower.tail=FALSE,
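The truncated note above presumably goes on to use the log scale. Whether or not it does, one way to work with such small tail probabilities is pnorm's log.p argument, which returns the natural log of the probability and therefore avoids underflow. A minimal sketch:

```r
z <- 37:39

# Direct tail probabilities: the larger z values underflow to zero.
p <- pnorm(z, lower.tail = FALSE)

# Natural log of the same probabilities, computed without underflow.
logp <- pnorm(z, lower.tail = FALSE, log.p = TRUE)

# Convert to base 10 for reporting a p-value as "10 to the power of ...".
log10p <- logp / log(10)

p
log10p
```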

Re: [R] change inter-line spacing in grid graphics - how to?

2009-04-07 Thread hadley wickham
Have a look at ?gpar - it will tell you about lineheight. Hadley On Tue, Apr 7, 2009 at 3:28 AM, Mark Heckmann mark.heckm...@gmx.de wrote: I am trying to change the inter-line spacing in grid.text(), but I just don't find how to do it. pushViewport(viewport()) grid.text(The inter-line

Re: [R] Using as.formula() with the reshape package cast

2009-04-07 Thread hadley wickham
On Tue, Apr 7, 2009 at 8:44 AM, ryan.shef...@malbecpartners.com wrote: I am trying to use the cast function from the reshape package, where the formula is not passed in directly, but as the result of the as.formula() function. Using reshape v. 0.7.2 I am able to properly melt() by data

Re: [R] newbie query: simple crosstabs

2009-04-07 Thread hadley wickham
On Tue, Apr 7, 2009 at 4:41 PM, Jorge Ivan Velez jorgeivanve...@gmail.com wrote: Hi Eik, You're absolutely right. My bad. Here is the correction of the code I sent: apply(mydata[,-1], 2, tapply, mydata[,1], function(x) sum(x)/length(x)) Or more simply: apply(mydata[,-1], 2, tapply,

Re: [R] package: maps and spatstat question

2009-04-06 Thread hadley wickham
Hi Laura, You might find the map_data function from the ggplot2 package helpful: library(ggplot2) library(maps) head(map_data("state", "iowa")) It formats the output of the map command into a self-documenting data frame. Hadley On Mon, Apr 6, 2009 at 7:00 AM, Laura Chihara lchih...@carleton.edu

Re: [R] Best way to turn a list into a data.frame

2009-04-06 Thread hadley wickham
On Mon, Apr 6, 2009 at 8:49 AM, Daniel Brewer daniel.bre...@icr.ac.uk wrote: Hello, What is the best way to turn a list into a data.frame? I have a list with something like: $`3845`  [1] 04010 04012 04360 $`1029` [1] 04110 04115 And I would like to get a data frame like the following:

Re: [R] SUM,COUNT,AVG

2009-04-06 Thread hadley wickham
On Mon, Apr 6, 2009 at 9:34 AM, Stavros Macrakis macra...@alum.mit.edu wrote: There are various ways to do this in R. # sample data dd <- data.frame(a=1:10,b=sample(3,10,replace=T),c=sample(3,10,replace=T)) Using the standard built-in functions, you can use: *** aggregate ***

Re: [R] Collapse data matrix with extra info separated by commas

2009-04-06 Thread hadley wickham
On Mon, Apr 6, 2009 at 10:40 AM, baptiste auguie ba...@exeter.ac.uk wrote: Here's one attempt with plyr, hopefully Hadley will give you a better solution (I could not get cast() to do it either) test <- data.frame(a=c("A","A","A","A","B","B","B"),b=c(1,1,2,2,1,1,1),c=sample(1:7))

Re: [R] SUM,COUNT,AVG

2009-04-06 Thread hadley wickham
On Mon, Apr 6, 2009 at 5:31 PM, Jun Shen jun.shen...@gmail.com wrote: This is a good example to compare different approaches. My understanding is aggregate() can apply one function to multiple columns summarize() can apply multiple functions to one column I am not sure if ddply() can actually

Re: [R] data.frame, converting row data to columns

2009-04-04 Thread hadley wickham
On Sat, Apr 4, 2009 at 12:09 PM, ds dhsha...@acad.umass.edu wrote: I have a data frame something like:                      name         wrist nLevel            emot 1                    4094          3.34                    1 frustrated 2                    4094          3.94              

Re: [R] data.frame, converting row data to columns

2009-04-04 Thread hadley wickham
On Sat, Apr 4, 2009 at 12:28 PM, jim holtman jholt...@gmail.com wrote: Does this do what you want: x <- read.table(textConnection("name         wrist nLevel            emot + 1                    4094          3.34                    1   frustrated + 2                    4094          3.94

Re: [R] plyr and table question

2009-04-03 Thread hadley wickham
On Fri, Apr 3, 2009 at 4:43 AM, baptiste auguie ba...@exeter.ac.uk wrote: Dear all, I'm puzzled by the following example inspired by a recent question on R-help, cc <- textConnection("user_id  website          time 20        google            0930 21        yahoo            0935 20

Re: [R] plyr and table question

2009-04-03 Thread hadley wickham
On Fri, Apr 3, 2009 at 8:43 AM, baptiste auguie ba...@exeter.ac.uk wrote: That makes sense, so I can do something like, count <- function(x){        as.integer(unclass(table(x))) } count(d$user_id) ddply(d, .(user_id), transform, count = count(user_id))  user_id  website time count 1

Re: [R] data.frame to array?

2009-04-03 Thread hadley wickham
On Fri, Apr 3, 2009 at 1:45 PM, rkevinbur...@charter.net wrote: I have a list of data.frames str(bins) List of 19217  $ 100026:'data.frame': 1 obs. of  6 variables:  ..$ Sku  : chr 100026  ..$ Bin  : chr T149C  ..$ Count: int 108  ..$ X    : int 20  ..$ Y    : int 149  ..$ Z    : chr

Re: [R] Deleting rows based on identity variable

2009-04-02 Thread hadley wickham
On Thu, Apr 2, 2009 at 3:37 PM, Rowe, Brian Lee Yung (Portfolio Analytics) b_r...@ml.com wrote: Is this what you want: d1[which(id != 4),] Or just d1[id != 4, ] Hadley -- http://had.co.nz/ __ R-help@r-project.org mailing list

Re: [R] Selecting all rows of factors which have at least one positive value?

2009-04-02 Thread hadley wickham
  X1 X2 1  11  0 2  11  0 3  11  0 4  11  1 5  12  0 6  12  0 7  12  0 8  13  0 9  13  1 10 13  1 and I want to select all rows pertaining to factor levels of X1 for which exists at least one 1 for X2. To be clear, I want rows 1:4 (since there exists at least one observation for

Re: [R] Calculating First Occurance by a factor

2009-04-01 Thread hadley wickham
I tried messing with the line df$FixTime[which.min(df$FixInx)] changing it to df[which.min(df$FixInx)] or adding new lines with the additional columns that I want to include, but nothing seemed to work. I'll admit I only have a mild understanding of what is going on with the function .fun. :-)

Re: [R] Calculating First Occurance by a factor

2009-04-01 Thread hadley wickham
On Wed, Apr 1, 2009 at 11:00 AM, hadley wickham h.wick...@gmail.com wrote: I tried messing with the line df$FixTime[which.min(df$FixInx)] changing it to df[which.min(df$FixInx)] or adding new lines with the additional columns that I want to include, but nothing seemed to work. I'll admit I only

Re: [R] Public R servers?

2009-04-01 Thread hadley wickham
Earlier I posted a question about memory usage, and the community's input was very helpful.  However, I'm now extending my dataset (which I use when running a regression using lm).  As a result, I am continuing to run into problems with memory usage, and I believe I need to shift to

[R] Bug in col2rgb?

2009-03-31 Thread hadley wickham
col2rgb(#0079, TRUE) [,1] red 0 green0 blue 0 alpha 121 col2rgb(#0080, TRUE) [,1] red255 green 255 blue 255 alpha0 col2rgb(#0081, TRUE) [,1] red 0 green0 blue 0 alpha 129 Any ideas? Thanks, Hadley -- http://had.co.nz/

Re: [R] Using apply to get group means

2009-03-31 Thread hadley wickham
On Tue, Mar 31, 2009 at 11:31 AM, baptiste auguie ba...@exeter.ac.uk wrote: Not exactly the output you asked for, but perhaps you can consider, library(doBy) summaryBy(x3~x2+x1,data=x,FUN=mean)  x2 x1 x3.mean 1  1  A     1.5 2  1  B     2.0 3  1  C     3.5 4  2  A     4.0 5  2  B    

Re: [R] Reshape: 'melt' numerous objects

2009-03-31 Thread hadley wickham
On Tue, Mar 31, 2009 at 11:12 AM, Steve Murray smurray...@hotmail.com wrote: Dear R Users, I'm trying to use the reshape package to 'melt' my gridded data into column format. I've done this before on individual files, but this time I'm trying to do it on a directory of files (with variable

Re: [R] ggplot: order of numeric factor levels?

2009-03-31 Thread hadley wickham
On Tue, Mar 31, 2009 at 5:01 PM, Marianne Promberger mprom...@psych.upenn.edu wrote: Hi, I'm having problems with qplot and the order of numeric factor levels. Factors with numeric levels show up in the order in which they appear in the data, not in the order of the levels (as far as I

Re: [R] how to input multiple .txt files

2009-03-30 Thread hadley wickham
On Mon, Mar 30, 2009 at 10:33 AM, Mike Lawrence mike.lawre...@dal.ca wrote: To repent for my sins, I'll also suggest that Hadley Wickham's plyr package (http://had.co.nz/plyr/) is also useful/parsimonious in this context: a <- ldply(cust1_files,read.table) You might also want to do

Re: [R] Calculating First Occurance by a factor

2009-03-30 Thread hadley wickham
On Mon, Mar 30, 2009 at 2:58 PM, Mike Lawrence mike.lawre...@dal.ca wrote: I discovered Hadley Wickham's plyr package last week and have found it very useful in circumstances like this: library(plyr) firstfixtime = ddply(       .data = data       , .variables = c('Sub','Tr','IA')       ,

Re: [R] histogram plots with many different samples

2009-03-25 Thread hadley wickham
Or use frequency polygons, if you want to stay with the interpretability of a histogram. Hadley On Wed, Mar 25, 2009 at 12:07 PM, Greg Snow greg.s...@imail.org wrote: Personally I find those types of plots difficult to interpret.  Much easier to create, view, and interpret is to simply plot

Re: [R] Following progress in a lapply() function

2009-03-22 Thread hadley wickham
On Sun, Mar 22, 2009 at 5:06 PM, Blanchette, Marco m...@stowers.org wrote: Dear all, I am processing a very long and complicated list using lapply through a custom function and I would like to generate some sort of progress report. For instance, print a dot on the screen every time 1000
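One way to get this kind of progress report is to iterate over indices rather than elements, so the wrapper knows how far along it is. The sketch below is an assumption-laden illustration, not the answer given in the thread (which is cut off here): `lapply_with_progress` and its `every` argument are hypothetical names, and "every 1000 elements" is a guess at what the truncated question intends.

```r
# Apply `fun` to each element of `x`, printing a dot after every
# `every` elements have been started.
lapply_with_progress <- function(x, fun, every = 1000, ...) {
  result <- lapply(seq_along(x), function(i) {
    if (i %% every == 0) cat(".")
    fun(x[[i]], ...)
  })
  cat("\n")
  # Iterating over indices drops element names, so restore them.
  names(result) <- names(x)
  result
}

# Example: square 5000 numbers, reporting progress as it goes.
squares <- lapply_with_progress(as.list(1:5000), function(v) v^2)
```

On a buffered console the dots may appear in batches; calling `flush.console()` after `cat(".")` is one way to force them out.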

Re: [R] Retrieving Vertices Coordinates from SpatialPolygons

2009-03-21 Thread hadley wickham
This came up on R-sig-geo two days ago and this is what I said: I have the following code in ggplot2 for turning a SpatialPolygon into a regular data frame of coordinates. You'll need to load ggplot2, and then run fortify(yoursp). fortify.SpatialPolygonsDataFrame <- function(shape, region =

Re: [R] Fisher test accuracy in doubt

2009-03-21 Thread hadley wickham
On Sat, Mar 21, 2009 at 2:03 PM, joker77 vijumo...@gmail.com wrote: Hi, I noted a discrepancy between R and openepi when I ran a fisher test with the same matrix. In R: a=matrix(c(1,2,6,17), nrow=2) a     [,1] [,2] [1,]    1    6 [2,]    2   17 fisher.test(a, conf.int=T)        

Re: [R] ggplot2: specifying legend titles

2009-03-20 Thread hadley wickham
On Fri, Mar 20, 2009 at 9:07 AM, Etches Jacob jetc...@iwh.on.ca wrote: I am trying to specify a legend title to be other than the variable name, but I find that the legend splits because scale_shape() takes effect but scale_colour() does not.  Can someone spot my error?  Here's some toy code

Re: [R] how to make aggregation in R ?

2009-03-19 Thread hadley wickham
On Thu, Mar 19, 2009 at 8:40 PM, jim holtman jholt...@gmail.com wrote: Try this technique.  I use it with large data objects since it is sometimes faster, and uses less memory, by using indices: x <- read.table(textConnection(  v1 v2 n1 n2 1   a a1  1 21 2   a a1  2 22 3   a a1  3 23 4   a

Re: [R] R-code in html help pages: syntax highlighting

2009-03-16 Thread hadley wickham
It would be pretty easy to use the output from the R parser (which is never wrong, is it?), and dump some markup out of it. For example the showTree function in codetools dumps an R expression as Lisp, this is not too far from generating html, or any other markup. As this sounds like fun,

Re: [R] - help - predicting with glmnet/lars for dataframes with different nrow then the train set

2009-03-16 Thread hadley wickham
On Mon, Mar 16, 2009 at 7:21 PM, eitan lavi lavi.ei...@gmail.com wrote: Hello, I'm having trouble using the lars and glmnet functions to predict on a new data set with a different nrow than the original, for instance: =    log.1 = glm(temp.data$TL~(.),temp.data,family =

Re: [R] Selecting / creating unique colours for behavioural / transitional data

2009-03-13 Thread hadley wickham
Thanks for the reply - some of the sets/palettes in RColorBrewer are ideal, but the problem I have is that they only go up to 12 colours, and I need 15 colours - so I assume the only thing I can do is create my own palette, but I'm having limited success in trying to work

Re: [R] Unable to run smoother in qplot() or ggplot() - complains about knots

2009-03-13 Thread hadley wickham
On Thu, Mar 12, 2009 at 5:37 PM, Christopher David Desjardins cddesjard...@gmail.com wrote: I get the following error when I run qplot() qplot(grade, read, data = hhm.long.m, geom = c("point", "smooth")) Error in smooth.construct.cr.smooth.spec(object, data, knots) :  x has insufficient unique

Re: [R] adding text and other elements to ggplot2 plots

2009-03-11 Thread hadley wickham
Have a look at the annotations section of http://had.co.nz/ggplot2/book/toolbox.pdf Hadley On Wed, Mar 11, 2009 at 8:44 AM, levyofi levy...@post.tau.ac.il wrote: Hello, I really like the interface and flexibility of the ggplots package. However, I cannot find how to add text to a plot (like
