Re: [R] Find index of a string inside a string?

2010-10-25 Thread Hadley Wickham
Or str_locate: library(stringr) str_locate("aabcd", "bcd") Hadley On Mon, Oct 25, 2010 at 5:53 AM, jim holtman wrote: I think what you want is 'regexpr': regexpr("bcd", "aabcd") [1] 3 attr(,"match.length") [1] 3 On Mon, Oct 25, 2010 at 7:27 AM, yoav baranan
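
For readers without stringr installed, the base-R route quoted above can be sketched as follows. This is a minimal illustration rather than the posters' exact session: the names `haystack` and `needle` are made up here, and `fixed = TRUE` is an added assumption so that the pattern is treated as a literal string instead of a regular expression.

```r
haystack <- "aabcd"
needle <- "bcd"

# regexpr() returns the 1-based start of the first match (-1 if none);
# the length of the match is stored in the "match.length" attribute.
hit <- regexpr(needle, haystack, fixed = TRUE)
start <- as.integer(hit)
end <- start + attr(hit, "match.length") - 1L

c(start = start, end = end)

# No match: the start position is -1.
miss <- as.integer(regexpr("zzz", haystack, fixed = TRUE))
miss
```

The start and end pair is the same information that str_locate reports, just assembled by hand.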

Re: [R] Which version control system to learn for managing R projects?

2010-10-26 Thread Hadley Wickham
git is where the world is headed.  This video is a little old:, but does a good job getting the point across. And lots of R users are using github already: Hadley -- Assistant Professor / Dobelman Family Junior

Re: [R] Forcing results from lm into datframe

2010-10-26 Thread Hadley Wickham
On Tue, Oct 26, 2010 at 11:55 AM, Dennis Murphy wrote: Hi: When it comes to split, apply, combine, think plyr. library(plyr) ldply(split(afvtprelvefs, afvtprelvefs$basestudy),         function(x) coef(lm (ef ~ quartile, data=x, weights=1/ef_std))) Or do it in two steps:

Re: [R] Which version control system to learn for managing R projects?

2010-10-26 Thread Hadley Wickham
1. What is everyone else using?  The network effect is important since you want people to be able to access your repository and you want to leverage your knowledge of the version control system for other projects' repositories.  To that extent Subversion is the clear choice since it's used on

Re: [R] overloading the generic primitive functions + and [

2010-10-28 Thread Hadley Wickham
Note how S3 methods are dispatched only by reference to the first argument (on the left of the operator). I think S4 beats this by having signatures that can dispatch depending on both arguments. That's somewhat of a simplification for primitive binary operators. R actually looks up the method

Re: [R] avoiding too many loops - reshaping data

2010-11-04 Thread Hadley Wickham
Beware of facile comparisons of this sort -- they may be apples and nematodes. And they also imply that the main time sink is the computation. In my experience, figuring out how to solve the problem using takes considerably more time than 18 / 1000 seconds, and so investing your energy in

Re: [R] Heatmap construction problems

2010-11-07 Thread Hadley Wickham
It's hard to know without a minimal reproducible example, but you probably want scale_fill_gradient or scale_fill_gradientn. Hadley On Thu, Oct 28, 2010 at 9:42 AM, Struchtemeyer, Chris wrote: I am very new to R and don't have any computer program experience whatsoever.  I

Re: [R] ggplot2: facet_grid with only one level does not display the graph with the facet_grid level in title

2010-11-07 Thread Hadley Wickham
This is on my to do list: Hadley On Thu, Oct 28, 2010 at 11:51 AM, Matthew Pettis wrote: Hi All, Here is the code that I'll be referring to: p <- ggplot(, aes(PER_KEY, EVENTS)) (p <- p +

[R] How to detect if a vector is FP constant?

2010-11-08 Thread Hadley Wickham
Hi all, What's the equivalent to length(unique(x)) == 1 if I want to ignore small floating point differences? Should I look at diff(range(x)) or sd(x) or something else? What cut off should I use? If it helps to be explicit, I'm interested in detecting when a vector is constant for the purpose

Re: [R] How to detect if a vector is FP constant?

2010-11-08 Thread Hadley Wickham
I think this does what you want (borrowing from all.equal.numeric): all(abs(x - mean(x)) < .Machine$double.eps^0.5) with a vector of length 1 million, it took .076 seconds on a fairly old system. Hmmm, maybe I want: all.equal(min(x), max(x)) ? Hadley -- Assistant Professor / Dobelman
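
The all.equal(min(x), max(x)) idea from this reply can be wrapped up as below. This is a sketch under stated assumptions, not code from the thread: the helper name `is_fp_constant`, the NA handling, and the default tolerance (borrowed from the `.Machine$double.eps^0.5` mentioned above) are all choices made for illustration.

```r
# Hypothetical helper: TRUE when all values of x agree up to a tolerance.
is_fp_constant <- function(x, tol = .Machine$double.eps^0.5) {
  x <- x[!is.na(x)]
  if (length(x) < 2) return(TRUE)
  # all.equal() returns TRUE or a character description of the difference,
  # so isTRUE() is needed to get a plain logical.
  isTRUE(all.equal(min(x), max(x), tolerance = tol))
}

x <- c(0.1 + 0.2, 0.3, 0.3)
length(unique(x)) == 1        # FALSE: 0.1 + 0.2 is not exactly 0.3
is_fp_constant(x)             # TRUE
is_fp_constant(c(1, 1.001))   # FALSE
```

Comparing only the minimum and maximum needs a single pass over the data and avoids computing a mean or standard deviation.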

Re: [R] Extending the accuracy of exp(1) in R

2010-11-09 Thread Hadley Wickham
Where the value of exp(1) as computed by R is concerned, you have been deceived by what R displays (prints) on screen. The default is to display any number to 7 digits of accuracy, but that is not the accuracy of the number held internally by R:  exp(1)  # [1] 2.718282  exp(1) - 2.718282  
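
The difference between the displayed and the stored value can be checked directly. A small sketch using only base R; the choice of 15 digits is arbitrary and only serves to show more of the stored double.

```r
exp(1)                        # printed with 7 significant digits by default
print(exp(1), digits = 15)    # same stored number, more digits displayed

# The stored value is not the 7-digit number that was printed.
gap <- exp(1) - 2.718282
gap                           # small, but not zero

full <- sprintf("%.15f", exp(1))
full
```

The digits argument (or options(digits = ...)) changes only the display, never the value held internally.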

Re: [R] sum in vector

2010-11-17 Thread Hadley Wickham
rowsum(value, paste(factor1, factor2, factor3)) That is dangerous in general, and always inefficient. Imagine factor1 is c("a", "a b") and factor2 is c("b c", "c"). Use interaction with drop = T. Hadley -- Assistant Professor / Dobelman Family Junior Chair Department of Statistics / Rice University
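
The collision described in this reply can be reproduced with a two-row example. The `value` vector is invented for illustration; the two factors are the ones named in the post.

```r
factor1 <- c("a", "a b")
factor2 <- c("b c", "c")
value <- c(1, 10)

# paste() joins with a space, so two different combinations collide.
pasted <- paste(factor1, factor2)
pasted                      # both elements are "a b c"
rowsum(value, pasted)       # one row: the two groups were merged

# interaction() keeps the combinations distinct; drop = TRUE discards
# level combinations that never occur in the data.
groups <- interaction(factor1, factor2, drop = TRUE)
nlevels(groups)             # 2
rowsum(value, groups)       # two rows, one per real combination
```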

Re: [R] Help on running regression by grouping firms

2010-11-25 Thread Hadley Wickham
res <- function(x) resid(x) ds_test$u <- , llply(mods, res)) I'd be a little careful with this, because there's no guarantee the results will be ordered in the same way as the input (and I'd also prefer ds_test$u <- unlist(llply(mods, res)) or ds_test$u <- laply(mods, res)) In your case,

Re: [R] Go (back) from Rd to roxygen

2010-11-25 Thread Hadley Wickham
Since roxygen is a great help to document R packages, I am wondering if there exists an approach to go back from the raw Rd files to roxygen-documentation? E.g. turn \author{Somebody} into @author Somebody. This sounds ridiculous, but I believe it helps in the long term for me to maintain R

Re: [R] ggplot2 histograms

2010-11-30 Thread Hadley Wickham
You may find it easier to use a frequency polygon, geom = "freqpoly". Hadley On Tue, Nov 30, 2010 at 2:36 PM, Small Sandy (NHS Greater Glasgow Clyde) wrote: Hi With ggplot2 I can very easily create beautiful histograms but I would like to put two histograms on the same

Re: [R] Fast string comparison

2010-07-12 Thread Hadley Wickham
strings <- replicate(1e5, paste(sample(letters, 100, rep = T), collapse = "")) system.time(strings[-1] == strings[-1e5]) # user system elapsed # 0.016 0.000 0.017 So it takes ~1/100 of a second to do ~100,000 string comparisons. You need to provide a reproducible example that illustrates

Re: [R] How to define a function (with '-') that has two arguments?

2010-07-14 Thread Hadley Wickham
On Wed, Jul 14, 2010 at 7:39 AM, wrote: Hi All, The last line of the following code returns the error right below this paragraph. Essentially, I use the operator %:% to retrieve a variable in a nested frame. Then I want to use the same operator

Re: [R] [R-pkgs] New package list for analyzing list surveyexperiments

2010-07-15 Thread Hadley Wickham
For some reason package writers seem to prefer maximally uninformative names for their packages.  To take some examples of recently announced packages, can anyone guess what packages 'FDTH', 'rtv', or 'lavaan' do?  Why the aversion to informative names along the lines of

Re: [R] qplot in ggplot2 not working any longer - (what did I do?)

2010-07-15 Thread Hadley Wickham
For a quick fix, you probably need to reinstall plyr. Hadley On Wed, Jul 14, 2010 at 11:03 PM, stephen sefick wrote: This is the first time that I have tried to update packages with a tinkered around with .Rprofile.  I start R with R --vanilla and it does not load my

Re: [R] Recommended way of requiring packages of a certain version?

2010-07-16 Thread Hadley Wickham
So distributing code to other people is preferably done using R packages, which gives you this option. However (as far as I am aware), note that this option is checked at package build time, not at load time. Hadley -- Assistant Professor / Dobelman Family Junior Chair Department of

Re: [R] a very particular plot

2010-07-16 Thread Hadley Wickham
On Wed, Jul 14, 2010 at 1:32 AM, Ian Bentley wrote: I've got a couple of more changes that I want to make to my plot, and I can't figure things out.  Thanks for all the help. I'm using this R script library(ggplot2) library(lattice) # Generate 50 data sets of size 100

Re: [R] Creating Enumerated Variables

2010-07-16 Thread Hadley Wickham
On Thu, Jul 15, 2010 at 11:08 PM, Dennis Murphy wrote: Hi: I sincerely hope there's an easier way, but one method to get this is as follows, with d as the data frame name of your test data: d <- d[order(with(d, Age, School, rev(Grade))), ] d$Count <- , mapply(seq,

Re: [R] NA preserved in logical call - I don't understand this behavior because NA is not equal to 0

2010-07-18 Thread Hadley Wickham
The problem is in data.frame[ and any NA in a logical vector will return a row of NA's. This can be avoided by wrapping which() around the logical vector which seems entirely wasteful or using subset(). The basic philosophy that causes this behaviour is sensible in my opinion: missing values
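
The behaviour under discussion, and the two workarounds mentioned, can be seen on a toy data frame. The data frame `df` and its columns are invented for this sketch.

```r
df <- data.frame(id = 1:4, score = c(5, NA, 0, 7))

# A logical index containing NA yields an all-NA row for that position.
with_na <- df[df$score > 0, ]
with_na

# which() drops NA (and FALSE) positions before indexing.
via_which <- df[which(df$score > 0), ]
via_which

# subset() treats NA in the condition as FALSE.
via_subset <- subset(df, score > 0)
via_subset
```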

Re: [R] best way to apply a list of functions to a dataset ?

2010-07-21 Thread Hadley Wickham
ddply(ma, .(variable), summarise, mean = mean(value), sd = sd(value),       skewness = skewness(value), median = median(value), = In principle, you should be able to do: ddply(ma, .(variable), colwise(each(mean, sd, skewness, median, but

Re: [R] using sample() for a vector of length 1

2010-07-22 Thread Hadley Wickham
Did you look at the examples in sample? # sample()'s surprise -- example x <- 1:10 sample(x[x > 8]) # length 2 sample(x[x > 9]) # oops -- length 10! sample(x[x > 10]) # length 0 ## For R >= 2.11.0 only resample <- function(x, ...) x[sample.int(length(x), ...)] resample(x[x > 8]) # length
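
Laid out as a script, the example from ?sample that this reply points to looks roughly like this. The checks on the result lengths are added here for illustration.

```r
x <- 1:10

# When x is a single number >= 1, sample(x) samples from 1:x instead.
length(sample(x[x > 9]))      # 10, not 1

# Index-based resampling never reinterprets a single number as an
# upper bound, so length-1 and length-0 inputs behave as expected.
resample <- function(x, ...) x[sample.int(length(x), ...)]

resample(x[x > 9])            # always 10
length(resample(x[x > 10]))   # 0
```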

Re: [R] p-VALUE calculation

2010-07-22 Thread Hadley Wickham
What is your null hypothesis? What is your alternate hypothesis? What is the test statistic? Why do you want a p-value? Hadley On Thu, Jul 22, 2010 at 5:40 PM, jd6688 wrote: Here is my dataframe with 1000 rows: employee_id         weigth       p-value 100            

Re: [R] union data in column

2010-07-24 Thread Hadley Wickham
On Sat, Jul 24, 2010 at 2:23 AM, Jeff Newmiller wrote: Fahim Md wrote: Is there any function/way to merge/unite the following data  GENEID      col1          col2             col3                col4  G234064         1             0                  0                

[R] [R-pkgs] plyr version 1.1

2010-07-26 Thread Hadley Wickham
plyr is a set of tools for a common set of problems: you need to break down a big data structure into manageable pieces, operate on each piece and then put all the pieces back together. For example, you might want to: * fit the same model to subsets of a data frame * quickly calculate

Re: [R] looking for setdiff equivalent on dataset

2010-07-29 Thread Hadley Wickham
Here's one way, using a function from the plyr package: TheLittleOne <- data.frame(cbind(c(2,3),c(2,3))) TheBigOne <- data.frame(cbind(c(1,1,2),c(1,1,2))) keys <- plyr:::join.keys(TheBigOne, TheLittleOne) !(keys$x %in% keys$y) TheBigOne[!(keys$x %in% keys$y), ] Hadley On Thu, Jul 29, 2010 at 1:38

Re: [R] looking for setdiff equivalent on dataset

2010-07-29 Thread Hadley Wickham
Well, here's one way that might work (explanation below): The idea is to turn each row into a character vector and then work with the two character vectors. bigs <- ,TheBigOne) ix <- which(bigs %in% setdiff(bigs,,TheLittleOne))) TheBigOne[ix,] However, this may

Re: [R] image plot but data not on grid.

2010-08-06 Thread Hadley Wickham
On Fri, Aug 6, 2010 at 9:24 AM, W Eryk Wolski wrote: Hi, Would like to make an image however the values in z are not on an uniform grid. Have a dataset with length(x) == length(y) == length(z) x[1],y[1] gives the position of z[1] and would like to encode value of z by

Re: [R] image plot but data not on grid.

2010-08-07 Thread Hadley Wickham
On Sat, Aug 7, 2010 at 2:54 AM, Michael Bedward wrote: On 7 August 2010 06:26, Hadley Wickham wrote: library(ggplot2) qplot(x, y, fill = z, data = df, geom = "tile") Hi Hadley, I read the original question as being about irregularly spaced data. The above method

Re: [R] image plot but data not on grid.

2010-08-09 Thread Hadley Wickham
With sweave, you need to explicitly print() the output of ggplot2 and lattice plots. Hadley On Mon, Aug 9, 2010 at 6:32 AM, W Eryk Wolski wrote: qplot does (?) what I was looking for! At least it plots what I want to plot in the interactive modus. However, it seems not to

Re: [R] coef(summary) and plyr

2010-08-09 Thread Hadley Wickham
On Mon, Aug 9, 2010 at 9:29 AM, David Winsemius wrote: If you look at the output (as I did)  you should see that despite whatever expectations you have developed regarding plyr, that it did not produce a grouping variable: ldply(dl, function(x) coef(summary(x)) )   fac

Re: [R] coef(summary) and plyr

2010-08-09 Thread Hadley Wickham
There is one further improvement to consider. When I tried using dlply to tackle a problem on which I had been bashing my head for the last three days and it gave just the results I had been looking for, I also noticed that the dlply function returns the grouping variable levels in an

Re: [R] coef(summary) and plyr

2010-08-09 Thread Hadley Wickham
That's exactly what dlply does - so you should never have to do that yourself. I'm unclear what you are saying. Are you saying that the plyr function _should_ have examined the objects in that list and determined that there were 4 rows and properly labeled the rows to indicate which list

Re: [R] coef(summary) and plyr

2010-08-09 Thread Hadley Wickham
On Mon, Aug 9, 2010 at 4:30 PM, Matthew Dowle wrote: Another option for consideration : library(data.table) mydt = mydt[,as.list(coef(lm(y~x1+x2+x3))),by=fac]     fac X.Intercept.       x1       x2        x3 [1,]   0  -0.16247059 1.130220

Re: [R] ggplot2 histograms... a subtle error found

2010-08-09 Thread hadley wickham
When ggplot2 verifies the widths before stacking (the default position for histograms), it computes the widths from the minimum and maximum values for each bin.  However, because the width of the bins (0.28) is much smaller than the scale of the edges (6.8e+09), there is some underflow and the

Re: [R] drawing dot plots with size, shape affecting dot characteristics

2010-08-12 Thread Hadley Wickham
On Wed, Aug 11, 2010 at 10:14 PM, Brian Tsai wrote: Hi all, I'm interested in doing a dot plot where *both* the size and color (more specifically, shade of grey) change with the associated value. I've found examples online for ggplot2 where you can scale the size of the

Re: [R] problems with merge() - the output has many repeated lines

2010-08-21 Thread Hadley Wickham
You may find a close reading of ?merge helpful, particularly this sentence: If there is more than one match, all possible matches contribute one row each (so check that you don't have multiple matches). Hadley On Sat, Aug 21, 2010 at 10:45 AM, Cecilia Carmo wrote: Hi
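
The sentence quoted from ?merge is easiest to see with keys that repeat on both sides. The two small data frames below are invented for illustration.

```r
left <- data.frame(id = c(1, 1, 2), a = c("x", "y", "z"))
right <- data.frame(id = c(1, 1, 2), b = c("p", "q", "r"))

# id 1 appears twice on each side, so it contributes 2 * 2 = 4 rows;
# id 2 appears once on each side and contributes 1 row.
merged <- merge(left, right, by = "id")
nrow(merged)               # 5

# A quick check for repeated keys before merging:
anyDuplicated(left$id) > 0
anyDuplicated(right$id) > 0
```

If one row per key is expected, checking for duplicated keys first is usually cheaper than untangling the merged result afterwards.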

[R] Recyclable

2010-08-23 Thread Hadley Wickham
Hi all, Is there a function to determine whether a set of vectors is cleanly recyclable? i.e. is there a common function for detecting the error/warnings that underlie the following two function calls? 1:3 + 1:2 [1] 2 4 4 Warning message: In 1:3 + 1:2 : longer object length is not a multiple

Re: [R] Recyclable

2010-08-23 Thread Hadley Wickham
I should note that I realise this function is pretty trivial to write (see below), I just want to avoid reinventing the wheel. recyclable <- function(...) { lengths <- vapply(list(...), length, 1); all(max(lengths) %% lengths == 0) } Hadley On Mon, Aug 23, 2010 at 10:33 AM, Hadley Wickham had
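
Written out over several lines, the function from this post can be exercised as below. The only change assumed here is `integer(1)` as the vapply template; zero-length inputs are not handled in this sketch.

```r
recyclable <- function(...) {
  lengths <- vapply(list(...), length, integer(1))
  # Every length must divide the longest one for recycling to be clean.
  all(max(lengths) %% lengths == 0)
}

recyclable(1:3, 1:2)         # FALSE: 3 is not a multiple of 2
recyclable(1:6, 1:2, 1:3)    # TRUE
recyclable(1:4, 1)           # TRUE: length-1 vectors always recycle
```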

[R] Comparing/diffing strings

2010-08-24 Thread Hadley Wickham
Hi all, all.equal is generally very useful when you want to find the differences between two objects. It breaks down however, when you have two long strings to compare: all.equal(a, b) [1] 1 string mismatch Does any one know of any good text diffing tools implemented in R? Thanks, Hadley

Re: [R] change order of plot panels in faceted ggplot/qplot

2010-08-24 Thread Hadley Wickham
On Mon, Aug 23, 2010 at 1:02 PM, Alison Macalady wrote: Hi, I have a 5-paneled figure that i made using the facet function in qplot (ggplot).  I've managed to arrange the panels into two rows/three columns, but for the sake of easy visual comparisons between panels in my

Re: [R] Comparing/diffing strings

2010-08-24 Thread Hadley Wickham
On Tue, Aug 24, 2010 at 11:25 AM, Martin Morgan wrote: On 08/24/2010 07:27 AM, Doran, Harold wrote: There is the stringMatch function in the MiscPsycho package. stringMatch('Hadley', 'Hadley Wickham', normalize = 'no') [1] 8 stringMatch('Hadley', 'Hadley Wickham

Re: [R] Plot bar lines like excel

2010-08-25 Thread Hadley Wickham
On Wed, Aug 25, 2010 at 6:05 AM, abotaha wrote: Woow, it is amazing, thank you very much. yes i forget to attach the dates, however, the dates in my case is every 16 days. so how i can use 16 day interval instead of month in by option. Here's one way using the lubridate

[R] [R-pkgs] stringr: version 0.4

2010-08-25 Thread Hadley Wickham
Strings are not glamorous, high-profile components of R, but they do play a big role in many data cleaning and preparations tasks. R provides a solid set of string operations, but because they have grown organically over time, they can be inconsistent and a little hard to learn. Additionally, they

Re: [R] log y 'axis' of histogram

2010-08-30 Thread Hadley Wickham
It's not just that counts might be zero, but also that the base of each bar starts at zero. I really don't see how logging the y/axis of a histogram makes sense. Hadley On Sunday, August 29, 2010, Joshua Wiley wrote: Hi Derek, Here is an option using the package

Re: [R] log y 'axis' of histogram

2010-08-30 Thread Hadley Wickham
I have counts ranging over 4-6 orders of magnitude with peaks occurring at various 'magic' values.  Using a log scale for the y-axis enables the smaller peaks, which would otherwise be almost invisible bumps along the x-axis, to be seen That doesn't justify the use of a _histogram_ - and

Re: [R] log y 'axis' of histogram

2010-08-30 Thread Hadley Wickham
That doesn't justify the use of a _histogram_  - and regardless of The usage highlights meaningful characteristics of the data. What better justification for any method of analysis and display is there? That you're displaying something that is mathematically well founded and meaningful - but

Re: [R] how to replace NA with a specific score that is dependant on another indicator variable

2010-09-01 Thread Hadley Wickham
first ddply result did I see that some sort of misregistration had occurred; Better with: res <- ddply(egraw2, .(category), .fun=function(df) { sapply(df, function(x) {mnx <- mean(x, na.rm=TRUE); sapply(x, function(z) if

[R] [R-pkgs] testthat: version 0.3

2010-09-01 Thread Hadley Wickham
# testthat Testing your code is normally painful and boring. `testthat` tries to make testing as fun as possible, so that you get a visceral satisfaction from writing tests. Testing should be fun, not a drag, so you do it all the time. To make that happen, `testthat`: * Provides functions that

Re: [R] Please explain in this context, or critique to stack this list faster

2010-09-04 Thread Hadley Wickham
One common way around this is to pre-allocate memory and then to populate the object using a loop, but a somewhat easier solution here turns out to be ldply() in the plyr package. The following is the same idea as, l), only faster: system.time(u3 <- ldply(l, rbind))   user  

Re: [R] Aggregating data from two data frames

2010-09-08 Thread Hadley Wickham
Have a look at match and merge. Hadley On Wednesday, September 8, 2010, Michael Haenlein wrote: Dear all, I'm working with two data frames. The first frame (agg_data) consists of two columns. agg_data[,1] is a unique ID for each row and agg_data[,2] contains a

Re: [R] adding list to data.frame iteratively

2010-09-08 Thread Hadley Wickham
Why don't you read the answers to your stackoverflow question? Hadley On Wed, Sep 8, 2010 at 1:17 AM, wrote: Hi, I have a preallocated dataframe to

Re: [R] Strange output daply with empty strata

2010-09-09 Thread hadley wickham
daply(data.test, .(municipality, employed), function(d){mean(d$age)} )     employed municipality   no  yes    A 41.58759 44.67463    B 55.57407 43.82545    C 43.59330   NA The .drop argument has a different meaning in daply. Some R functions have

[R] [R-pkgs] plyr: version 1.2

2010-09-10 Thread Hadley Wickham
plyr is a set of tools for a common set of problems: you need to __split__ up a big data structure into homogeneous pieces, __apply__ a function to each piece and then __combine__ all the results back together. For example, you might want to: * fit the same model each patient subsets of a data

[R] [R-pkgs] reshape2: a reboot of the reshape package

2010-09-10 Thread Hadley Wickham
Reshape2 is a reboot of the reshape package. It's been over five years since the first release of the package, and in that time I've learned a tremendous amount about R programming, and how to work with data in R. Reshape2 uses that knowledge to make a new package for reshaping data that is much

Re: [R] Data.frames : difference between x$a and x[, a] ? - How set new values on x$a with a as variable ?

2010-09-10 Thread Hadley Wickham
I'm having trouble parsing this. What exactly do you want to do? 1 - Put a list as an element of a data.frame. That's quite convenient for my pricing function. I think this is a really bad idea. data.frames are not meant to be used in this way. Why not use a list of lists? It can be very

Re: [R] Data.frames : difference between x$a and x[, a] ? - How set new values on x$a with a as variable ?

2010-09-10 Thread Hadley Wickham
I think this is a really bad idea. data.frames are not meant to be used in this way. Why not use a list of lists? It can be very convenient, but I suspect the original poster is confused about the difference between vectors and lists. I wouldn't be surprised if someone were confused, since

Re: [R] Which language is faster for numerical computation?

2010-09-10 Thread Hadley Wickham
On Fri, Sep 10, 2010 at 10:23 AM, Henrik Bengtsson wrote: Don't underestimate the importance of the choice of the algorithm you use.  That often makes a huge difference.  Also, vectorization is key in R, and when you use that you're really up there among the top

Re: [R] Problems with reshape2 on Mac

2010-09-13 Thread Hadley Wickham
Hi Uwe, The problem is most likely because the original poster doesn't have the latest version of plyr. I correctly declare this dependency in the DESCRIPTION (, but unfortunately R doesn't seem to use this information at run time,

Re: [R] post

2010-09-13 Thread Hadley Wickham
Have a look at: Computing Thousands of Test Statistics Simultaneously in R by Holger Schwender and Tina Müller, in Hadley On Mon, Sep 13, 2010 at 4:26 PM, Alexey Ush wrote: Hello, I have a question regarding how to

Re: [R] parallel computation with plyr 1.2.1

2010-09-16 Thread Hadley Wickham
Yes, this was a little bug that will be fixed in the next release. Hadley On Thu, Sep 16, 2010 at 1:11 PM, Dylan Beaudette wrote: Hi, I have been trying to use the new .parallel argument with the most recent version of plyr [1] to speed up some tasks. I can run the

Re: [R] Problem with ggplot2 - Boxplot

2010-09-22 Thread Hadley Wickham
That implies you need to update your version of plyr. Hadley On Wed, Sep 22, 2010 at 4:10 AM, RaoulD wrote: Hi, I am using ggplot2 to create a boxplot that summarizes a continuous variable. This code works fine for me on one PC however when I use it on another it

Re: [R] Script auto-detecting its own path

2010-09-29 Thread Hadley Wickham
Forgive me if this question has been addressed, but I was unable to find anything in the r-help list or in cyberspace. My question is this: is there a function, or set of functions, that will enable a script to detect its own path? I have tried file.path() but that was not what I was

Re: [R] function which can apply a function by a grouping variable and also hand over an additional variable, e.g. a weight

2010-10-01 Thread Hadley Wickham
You might want to check out the plyr package. Hadley On Fri, Oct 1, 2010 at 6:05 AM, Werner W. wrote: Hi, I was wondering if there is an easy way to accomplish the following in R: Often I want to apply a function, e.g. weighted.quantile from the Hmisc package to

Re: [R] plyr: a*ply with functions that return matrices-- possible bug in aaply?

2010-10-04 Thread hadley wickham
 That is, I want to define something like the following using an a*ply method, but aaply gives a result in which the applied .margin(s) do not appear last in the result, contrary to the documentation for ?aaply.  I think this is a bug, either in the function or the documentation, but perhaps

Re: [R] Issue with

2010-10-04 Thread Hadley Wickham
RFF-function(qtype, qOpt,...){} i.e., I have two args that are compulsary and the rest are optional. Now when my user passes the function call, I need to see what optional args are defined and process accordingly...what I have so far is.. RFF-function(qtype, qOpt,...){        mc -

Re: [R] Script auto-detecting its own path

2010-10-04 Thread Hadley Wickham
I'm not sure this will solve the issue because if I move the script, I would still have to go into the script and edit the /path/to/my/script.r, or do I misunderstand your workaround? I'm looking for something like: and which would return something like: [1]

Re: [R] R: Tools for thinking about data analysis and graphics

2010-10-06 Thread Hadley Wickham
On Wed, Oct 6, 2010 at 4:05 PM, Michael Friendly wrote:  I'm giving a talk about some aspects of language and conceptual tools for thinking about how to solve problems in several programming languages for statistical computing and graphics. I'm particularly interested in

Re: [R] Looking for a book/tutorial with the following context:

2010-10-08 Thread Hadley Wickham
Do you also know more references about variables? Unfortunately this was a little bit short so I do not feel 100% sure I completely got it. Try here: It's a work in progress. Hadley -- Assistant Professor / Dobelman Family Junior Chair

Re: [R] can't find and install reshape2??

2010-10-12 Thread Hadley Wickham
My guess is you are using an outdated R version for which the rather new reshape2 package has not been compiled. I wonder if install.packages() could detect this case (e.g. by also checking if the source version is not available), and offer a more informative error message. Hadley --

Re: [R] Query on save.image()

2010-10-14 Thread Hadley Wickham
On Thu, Oct 14, 2010 at 11:56 AM, Joshua Wiley wrote: Hi, I do not believe you can use the save.image() function in this case. save.image() is a wrapper for save() with defaults for the global environment (your workspace).  Try this instead, I believe it does what you

Re: [R] 3D to 2D projection

2009-09-28 Thread hadley wickham
Have you used persp or trans3d before? Here is a little piece of data that I am want to convert to 2d. I can plot (x,z) or (z,y). I know there is a better way to convert it to 2d. I did it long time back in my 3d geometry class. ? Hadley --

Re: [R] sort dates within a factor

2009-09-29 Thread hadley wickham
On Tue, Sep 29, 2009 at 6:58 AM, wrote: Apologies for the misunderstanding. I can come up with a solution that might suit your needs: library(plyr) out - ddply(test, .(nr), function(x) data.frame(date=x$date, index=rank(-as.integer(x$date out[$nr) |

Re: [R] Condition to factor (easy to remember)

2009-09-30 Thread hadley wickham
On Wed, Sep 30, 2009 at 2:32 PM, Ista Zahn wrote: An extremely verbose, but (in my view) easy to understand approach is: data.f <- data; data.f[which(data <= 10)] <- levs[1]; data.f[which(data > 10)] <- levs[2]; data.f <- factor(data.f) All those which()s are unnecessary. And

Re: [R] split-apply question

2009-10-02 Thread hadley wickham
On Fri, Oct 2, 2009 at 4:24 AM, jim holtman wrote: try this: x - read.table(textConnection(x1  x2  x3 + A   1    1.5 + B   2    0.9 + B   3    2.7 + C   7    1.8 + D   7    1.3), header=TRUE) closeAllConnections(), lapply(split(seq(nrow(x)), x$x1),

Re: [R] Stranger Behavior -maybe not

2009-10-03 Thread hadley wickham
The point Duncan was making was What other value would you expect it to have after incrementing from 1 to 200 and stopping? Well, depending on the scoping rules of the language you are used to, you might expect i at the top-level to remain undefined. Hadley --
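
The scoping point can be checked in a fresh session. In this sketch the function `count_to` and the index name `j` are hypothetical, and the final check assumes no variable called `j` already exists.

```r
for (i in 1:200) {
  # loop body intentionally empty
}
i                # 200: the loop variable persists after the loop ends

# Wrapping the loop in a function keeps its index out of the caller's scope.
count_to <- function(n) {
  for (j in seq_len(n)) NULL
  invisible(NULL)
}
count_to(5)
exists("j")      # FALSE, provided no `j` was defined earlier in the session
```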

Re: [R] ggplot2: proper use of facet_grid inside a function

2009-10-05 Thread hadley wickham
Whether or not what follows is to be recommended I don't know, but it seems to work, p - ggplot(diamonds, aes(carat, ..density..)) +  geom_histogram(binwidth = 0.2) x = quote(cut) facets = facet_grid(as.formula(bquote(.~.(x p + facets That's what I'd recommend. You can also just do

Re: [R] ggplot cumsum refined question (?)

2009-10-06 Thread hadley wickham
It is much easier to do your data preparation before plotting. Cummul <- ddply(subset(DF, precipitation!=NA), gauge_name, function(x){ x$Cummul <- cumsum(x$precipitation); x }) With a little less typing: Cummul <- ddply(subset(DF, precipitation!=NA), gauge_name, transform,

Re: [R] Letter-based representation of pairwise comparisons

2009-10-06 Thread hadley wickham
Please provide a reproducible example. I've had problems with multcompLetters in the past, because I was giving it slightly incorrect input. Hadley On Tue, Oct 6, 2009 at 7:41 AM, goz wrote: hello, i try to use the multcomp letters, but i have problems with my

Re: [R] splitting dataframe, assign to new dataframe, add new rows to new dataframe

2009-10-13 Thread hadley wickham
On Tue, Oct 13, 2009 at 6:57 AM, Ista Zahn wrote: I'm sure there's a really cool way to do this with plyr, although I don't know if my particular plyr version is much better. Anyway here it is: cmbine <- read.csv(textConnection('names, mass, classes apple,0.50,1

Re: [R] Use R -- term and logo copyright?

2009-10-13 Thread hadley wickham
Can I use (a) the logo and/or (b) the slogan for the KCL R workshops? I think it is quite clear from my website that this is neither about the Springer book series nor about an R user conference. No, sorry, the logo/name should be used exclusively for the Springer series and the R User

Re: [R] lapply() reccursively

2009-10-13 Thread hadley wickham
Neither a1 nor 2:100 are lists, so it would seem that sapply would be more appropriate. The difference between lapply and sapply is the output, not the input. Hadley -- __ mailing list
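
The point that the two functions differ in their output, not their input, shows up with the same non-list input given to both. The squaring function is an arbitrary stand-in.

```r
# Same atomic input, same function; only the shape of the result differs.
squares_list <- lapply(2:5, function(i) i^2)
squares_vec <- sapply(2:5, function(i) i^2)

is.list(squares_list)     # TRUE: lapply always returns a list
is.numeric(squares_vec)   # TRUE: sapply simplified the result to a vector

# vapply() is a stricter alternative that states the expected result type.
squares_checked <- vapply(2:5, function(i) i^2, numeric(1))
identical(squares_vec, squares_checked)   # TRUE
```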

Re: [R] ggplot2 scale_shape question

2009-10-14 Thread hadley wickham
Is there a way to have some points solid and some points hollow? I have two classes of points and there are so many points, that it's hard to see just the difference in shapes. I'd like to have one of the classes be hollow in addition to being a different shape. Any help would be grand.

Re: [R] Cacheing computationally expensive getter methods for S4 objects

2009-10-14 Thread hadley wickham
Just a note: this technique is called memoisation. Hadley On Wed, Oct 14, 2009 at 3:42 PM, Benilton Carvalho wrote: Thank you very much, Martin. :) b On Oct 14, 2009, at 5:23 PM, Martin Morgan wrote: Steve Lianoglou wrote: Very clever, that looks to do the trick! I

Re: [R] Why points() is defined specially for a 1 by 2 matrix?

2009-10-19 Thread hadley wickham
 To answer one of your other questions: ggplot (and lattice) is/are very powerful, but base graphics are (a) easier to get your head around and (b) easier to adjust if you don't like the defaults.  Changing things just a little bit in ggplot can be difficult (as an example, the answer to your

Re: [R] Putting names on a ggplot

2009-10-20 Thread hadley wickham
On Sun, Oct 18, 2009 at 10:29 AM, John Kane wrote: Thanks Stefan, the annotate approach works beautifully.  I had not got that far in Hadley's book apparently :( I'm not convinced though that the explaination you shouldn't use aes in this case since nampost, temprange,

Re: [R] Sandard deviation calculation

2009-10-26 Thread hadley wickham
What are the values of  length((Ht_cm[type=='SD'][from_treeline=='above'])[1]) I suspect the error is in the subsetting - the following seems more plausible: Ht_cm[type=='SD' & from_treeline=='above'] Hadley -- __
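
Why chained subsetting goes wrong can be seen with four made-up observations. The variable names follow the post, but the values are invented and `&` is assumed to be the intended combining operator.

```r
Ht_cm <- c(10, 20, 30, 40)
type <- c("SD", "XX", "SD", "SD")
from_treeline <- c("below", "above", "above", "below")

# Chained: the second condition has length 4 but is applied to the
# length-3 result of the first subset, so positions no longer line up.
chained <- Ht_cm[type == "SD"][from_treeline == "above"]
chained     # 30 40, although the 40 was measured "below"

# Combined: both conditions are evaluated on the original positions.
combined <- Ht_cm[type == "SD" & from_treeline == "above"]
combined    # 30
```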

Re: [R] GGPLOT2 Different Layers Different X Values

2009-10-28 Thread Hadley Wickham
Hi John, Could you please provide a small reproducible example? Thanks, Hadley Sent from my iPhone On 26/10/2009, at 6:50 PM, Jonathan Bleyhl wrote: I'm trying to plot values based on a date and then overlay a histogram also by date. The problem is that

Re: [R] ggplot2: stat_bin ..count.. with geom_text when NA is present

2009-10-28 Thread hadley wickham
Hi Bryan, Thanks for the reproducible example. The problem is actually in your code, not mine ;) You probably want: y = min(res, na.rm = TRUE) - 0.1 * diff(range(res, na.rm = TRUE)) Hadley (drop = TRUE solves a different problem - it controls whether or not to remove bins with zero count)

Re: [R] The system cannot find the file specified

2009-10-29 Thread hadley wickham
Do you have write permission in C:\Program Files\R\R-2.9.2\library?  It could be that the installer just tried to create the QRMlib subdir, and failed, and that's why it doesn't exist. One possible reason for failure is that your virus checker prevented the R installer from creating a new

Re: [R] ggplot2: Histogram with negative values on x-axis doesn't work

2009-10-29 Thread hadley wickham
I can reproduce it with for example x=c(-9.23, -9.56, -1.40) But adding a single positive number, even .001, fixes it, while adding a similar negative number introduces a new error message, so it really looks like a bug in ggplot2 when all the values are negative. Report it to the

Re: [R] multiple pages with ggplot2 facet_wrap?

2009-10-30 Thread hadley wickham
On Wed, Oct 28, 2009 at 8:19 PM, Bill Gillespie wrote: I currently use lattice functions to produce multiple pages of plots using the layout argument to specify the number of rows and columns of panels, e.g., xyplot(price ~ carat | clarity, diamonds, layout = c(2, 2))

Re: [R] Safe way to automatically install required packages...

2009-11-02 Thread hadley wickham
If your package depends on another package, it will be automatically installed. Hadley On Mon, Nov 2, 2009 at 12:56 PM, Jonathan Greenberg wrote: R-helpers:   I'm working on an r-package that I want to make as easy-to-use as possible for a novice R-user, which includes

Re: [R] Patterned shading in ggplot

2009-11-04 Thread hadley wickham
Hi Paul, You might want to try the gray colour scale - scale_fill_grey() Unfortunately grid (the underlying graphics library that ggplot2 uses) does not currently support patterns. Hadley On Wed, Nov 4, 2009 at 4:17 AM, Paul Chatfield wrote: Am trying to produce a

Re: [R] map of a country and its different geographical levels

2009-11-07 Thread hadley wickham
If readShapePoly() (deprecated - use readShapeSpatial() instead) says that the data are not polygons, then they are not. If you want to fill administrative boundaries polygons, you need polygons, not lines. The source you are using is based on OpenStreetMaps, so more likely to be lines, and as

[R] Extracting matched expressions

2009-11-08 Thread Hadley Wickham
Hi all, Is there a tool in base R to extract matched expressions from a regular expression? i.e. given the regular expression "(.*?) (.*?) ([ehtr]{5})" is there a way to extract the character vector c("one", "two", "three") from the string "one two three"? Thanks, Hadley --
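
In current versions of R, one way to do this is with regexec() and regmatches(). These functions are not mentioned in the thread and, as far as I know, were added to base R after it, so treat this as a later-day sketch rather than an answer available at the time; `perl = TRUE` is an added assumption to get Perl-style lazy quantifiers.

```r
x <- "one two three"
pattern <- "(.*?) (.*?) ([ehtr]{5})"

# regexec() records the position of the full match and of each
# parenthesised group; regmatches() extracts the corresponding text.
m <- regmatches(x, regexec(pattern, x, perl = TRUE))[[1]]

m[1]     # the full match
m[-1]    # the captured groups: "one" "two" "three"
```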

Re: [R] Extracting matched expressions

2009-11-09 Thread Hadley Wickham
: Is this what you want: x <- ' one two three ' y <- sub(".*?([^[:space:]]+)[[:space:]]+([^[:space:]]+)[[:space:]]+([ehrt]{5}).*", "\\1 \\2 \\3", x, perl=TRUE) unlist(strsplit(y, ' ')) [1] "one" "two" "three" On Sun, Nov 8, 2009 at 1:51 PM, Hadley Wickham
