Re: [R] merging to data.frames whose columns are different but follow a pattern.

2011-10-25 Thread Dennis Murphy
Hi: Try this: INDIVIDUAL <- transform(INDIVIDUAL, IDHOUS = IDPERS %/% 100) merge(INDIVIDUAL, HOUSHOLD, by = c('IDHOUS', 'T')) IDHOUS T IDPERS SEX PC02 GHS Single COM2 NBPERS NBKID 1 41 1 4101 19 1 NA2 5 3 2 41 1 4102 20 0 NA2 5 3

Re: [R] regression using GMM for mulltiple groups

2011-10-25 Thread Dennis Murphy
There are a few ways to do this. One is to use the lmList() function in the nlme package and use conditioning in the model formula. Another is to use the plyr package to create a list of models from which you can extract pieces of output from each model fit to the data subsets; for example,
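
A minimal base-R sketch of the list-of-models idea described above. It is not the code from the original post: split() and lapply() stand in for the plyr approach, and the built-in mtcars data with cyl as the grouping variable are assumptions chosen only for illustration.

```r
# Fit one linear model per group and collect the fits in a named list.
# mtcars and the grouping variable cyl are illustrative assumptions.
fits <- lapply(split(mtcars, mtcars$cyl), function(d) lm(mpg ~ wt, data = d))

# Extract a piece of output from each fit: one row of coefficients per group.
coefs <- t(sapply(fits, coef))
coefs
```

Any other extractor (for example, a function returning the R-squared from summary()) can be swapped in for coef in the same way.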

Re: [R] extract data for specific levels factor

2011-10-25 Thread Dennis Murphy
Are you trying to separate the substrings in cat? If so, one way is to use the colsplit() function in the reshape2 package, something like (untested since you did not provide a suitable data format with which to work): library('reshape2') splitcat <- colsplit(mydata$cat, ' ', names = c('fat',

Re: [R] binning runtimes

2011-10-24 Thread Dennis Murphy
Hi: On Mon, Oct 24, 2011 at 2:01 AM, Giovanni Azua brave...@gmail.com wrote: Hello, Suppose I have the dataset shown below. The amount of observations is too massive to get a nice geom_point and smoother on top. What I would like to do is to bin the data first. The data is indexed by Time

Re: [R] How to selectively sum rows [Beginner question]

2011-10-24 Thread Dennis Murphy
See the count() function in the plyr package; it does fast summation. Something like library('plyr') count(passengerData, c('ORIGIN_WAC', 'DEST_WAC'), 'npassengers') HTH, Dennis On Mon, Oct 24, 2011 at 8:27 AM, asindc siiri...@eastwestcenter.org wrote: Hi, I am new to R so I would appreciate

Re: [R] bestglm function and output in R

2011-10-24 Thread Dennis Murphy
Hi: bestglmtest is your input data frame, is it not? From the names() line, you can see that it has no variable named BestModel that corresponds to a list containing a component named coefficients. Were you perhaps looking for output$BestModel$coefficients ?? Dennis On Mon, Oct 24, 2011 at

Re: [R] Creating a histogram properly

2011-10-24 Thread Dennis Murphy
Hi: Aren't V1 and V2 factors in this data frame? If so, you should be plotting bar charts rather than histograms (they're not the same). Do you want separate graphs for V1 and V2, do you want them stacked, or do you want them dodged (side-by-side for each level 001, 002, 003)? In the absence of

Re: [R] tree construction

2011-10-24 Thread Dennis Murphy
What kind of tree do you want? The sos package can help you find R functions associated with a particular topic: # install.packages('sos') library('sos') findFn('tree') found 2798 matches; retrieving 20 pages, 400 matches. 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20

Re: [R] Adding points to a wireframe: 'x and units must have length 0' error

2011-10-24 Thread Dennis Murphy
Hi David: When I try your code, I get the wireframe with the x, y, z axes sans bounding cube and points, along with the error message Error using packet 1 object 'pts' not found sessionInfo() R version 2.13.1 (2011-07-08) Platform: x86_64-pc-mingw32/x64 (64-bit) locale: [1]

Re: [R] Syntax Help for xyplot()

2011-10-24 Thread Dennis Murphy
Hi: You should perhaps do the following in the xyplot() call (untested because the TDS and Cond data frames are missing): xyplot(TDS$quant ~ Cond$quant | burns.tds.anal$site) assuming that all three atomic objects have the same length. Caveat emptor. Dennis On Mon, Oct 24, 2011 at 2:17

Re: [R] using predict.lm() within a function

2011-10-24 Thread Dennis Murphy
Hi Michael: Try this: show.beta <- function(model, x = 'x', x1, x2, label, col = 'black', ...) { abline(model, col=col, lwd=2) xs <- data.frame(c(x1, x2, x2)) names(xs) <- attr(model$coefficients, 'names')[2] ys <- predict(model, xs) lines(cbind(xs,ys[c(1,1,2)]),

Re: [R] unfold list (variable number of columns) into a data frame

2011-10-23 Thread Dennis Murphy
Hi: Here's one approach: # Function to process a list component into a data frame ff <- function(x) { data.frame(time = x[1], partitioning_mode = x[2], workload = x[3], runtime = as.numeric(x[4:length(x)]) ) } # Apply it to each element of the list: do.call(rbind,

Re: [R] summarizing a data frame i.e. count - group by

2011-10-23 Thread Dennis Murphy
And the plyr version of this would be (using DF as the data frame name) ## transform method, mapping length(runtime) to all observations ## similar to David's results: library('plyr') ddply(DF, .(time, partitioning_mode), transform, n = length(runtime)) # or equivalently, the newer and somewhat

Re: [R] lapply to return vector

2011-10-22 Thread Dennis Murphy
do.call(rbind, lapply(...)) HTH, D. On Sat, Oct 22, 2011 at 1:44 AM, Alaios ala...@yahoo.com wrote: Dear all I have wrote the following line return(as.vector(lapply(as.data.frame(data),min,simplify=TRUE))); I want the lapply to return a vector as it returns a list with elements as shown
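
A short sketch of the suggestion above, with vapply() added as an alternative that is not part of the original reply. The matrix named data is made-up input for illustration only.

```r
# A small illustrative matrix (assumed data, not from the thread).
data <- matrix(c(3, 1, 2, 9, 8, 7), ncol = 2)

# lapply() always returns a list; rbind-ing its elements gives a one-column matrix.
mins_mat <- do.call(rbind, lapply(as.data.frame(data), min))

# vapply() returns an atomic vector directly and checks each result's type.
mins_vec <- vapply(as.data.frame(data), min, numeric(1))
```

Which form is preferable depends on whether a matrix or a plain vector is wanted downstream.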

Re: [R] R for loop stops after 4 iterations

2011-10-22 Thread Dennis Murphy
Hi: Here are a couple of ways, using the data snippet you provided as the input data frame e. Start by defining the function, which outputs a percentage: f <- function(n, mean, sd) { s <- rnorm(n, mean = mean, s = sd) round(100 * sum(s 0.42)/length(s), 4) } (1) Use the plyr package

Re: [R] Expanding rows of a data frame into multiple rows

2011-10-22 Thread Dennis Murphy
Here's another approach using the plyr package: # Function to process each row of input: g <- function(d) { y <- unlist(d$observations) if(length(y) > 0) data.frame(site = d$site, sector = d$sector, y = y) else NULL } library('plyr') ddply(input, .(site), g) site sector y 1

Re: [R] stacked plot

2011-10-21 Thread Dennis Murphy
It appears that your object is currently a matrix. Here's a toy example to illustrate how to get a stacked bar chart in ggplot2: library('ggplot2') m <- matrix(1:9, ncol = 3, dimnames = list(letters[1:3], LETTERS[1:3])) (d <- as.data.frame(as.table(m))) Var1 Var2 Freq 1 a A 1 2 b A

Re: [R] lattice::xyplot/ggplot2: plotting weighted data frames with lmline and smooth

2011-10-21 Thread Dennis Murphy
Hi Michael: Here's one way to get it from ggplot2. To avoid possible overplotting, I jittered the points horizontally by +/- 0.2. I also reduced the point size from the default 2 and increased the line thickness to 1.5 for both fitted curves. In ggplot2, the term faceting is synonymous with

Re: [R] plotting average effects.

2011-10-21 Thread Dennis Murphy
Hi: Your approach to computing the means is not efficient; a better way would be to use the aggregate() function. I would start by combining the grouping variable and the three prediction variables into a data frame. To get the groupwise mean for all three prediction variables, you can use a
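
One common way to get groupwise means for several variables with aggregate() is sketched below. The data frame preds and its column names (grp, p1, p2, p3) are hypothetical; the original post's data and exact call are not shown in this preview.

```r
set.seed(1)
# Hypothetical data: one grouping variable and three prediction variables.
preds <- data.frame(grp = rep(c("a", "b"), each = 5),
                    p1 = rnorm(10), p2 = rnorm(10), p3 = rnorm(10))

# cbind() on the left-hand side requests the mean of all three columns by group.
means <- aggregate(cbind(p1, p2, p3) ~ grp, data = preds, FUN = mean)
means
```

The result is a data frame with one row per group, which is convenient for plotting afterwards.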

Re: [R] lattice::xyplot/ggplot2: plotting weighted data frames with lmline and smooth

2011-10-21 Thread Dennis Murphy
(legend.position = c(0.14, 0.885), legend.background = theme_rect(fill = 'white')) Dennis On Fri, Oct 21, 2011 at 11:57 AM, Michael Friendly frien...@yorku.ca wrote: Thanks very much, Dennis.  See below for something I don't understand. On 10/21/2011 12:15 PM, Dennis Murphy wrote: Hi Michael

Re: [R] Calculating difference between values in data frame based on separate column

2011-10-21 Thread Dennis Murphy
Here's another way, using the reshape2 package: library(reshape2) d <- dcast(df, vial ~ measure, value_var = 'value') d$diff <- with(d, B - A) d vial A B diff 1 1 12 26 14 2 2 30 45 15 3 3 27 32 5 4 4 6 34 28 HTH, Dennis On Fri, Oct 21, 2011 at 3:31 PM, Nathan Miller

Re: [R] Aggregating data help

2011-10-20 Thread Dennis Murphy
Hi: Here's a way using the reshape2 package. library('reshape2') rsub <- subset(rtest, concept %in% c('8.2.D', '8.3.A', '8.3.B')) # want year ahead of concept in the variable list rsub <- rsub[, c(1:4, 9, 5:8)] dcast(rsub, id + test + subject + grade + year ~ concept, value_var = 'per_corr') #

Re: [R] Scatterplot with the 3rd dimension = color?

2011-10-20 Thread Dennis Murphy
AFAIK, you can't 'add' two ggplot2 graphs together; the problem in this case is that the two color scales would clash. If you're willing to discretize the z values, then you could pull it off. Here's an example: d <- data.frame(x = rnorm(100), y = rnorm(100), z = factor(1 + (rnorm(100) 0))) d1 <-

Re: [R] Calculating differences

2011-10-20 Thread Dennis Murphy
Hi: Here's one way with the plyr package. Using ds as the name of your data frame (thank you for the dput and clear description of what you wanted, BTW), library('plyr') ddply(ds, .(date), mutate, minspd = min(speed), Cmin = C[which.min(speed)], diff = C - Cmin) speed C house date hour

Re: [R] Help Reformatting a Data table

2011-10-19 Thread Dennis Murphy
Here's one way to do it using the reshape package: library('reshape') cast(A, YEAR ~ TAX, value = 'NUMBER', fill = 0) YEAR A B 1 2000 2 3 2 2001 2 4 3 2002 3 3 4 2003 1 0 5 2004 0 2 HTH, Dennis On Tue, Oct 18, 2011 at 7:51 PM, Michael E. Steiper michaelstei...@gmail.com wrote: Hi, I am a

Re: [R] help with x axis on lattice barchart--R-Beginner

2011-10-19 Thread Dennis Murphy
Hi: Your example data don't exhibit the types of behavior that you're concerned about if you keep it simple. Here's how you can get the *example* data plotted in both lattice and ggplot2; if the example data are a subset of a much larger data set, then see below. This 'works' on the example data

Re: [R] Detect and replace omitted data

2011-10-18 Thread Dennis Murphy
Prompted by David's xtabs() suggestion, one way to do what I think the OP wants is to * define day and unit as factors whose levels comprise the full range of desired values; * use xtabs(); * return the result as a data frame. Something like x <- data.frame( day = factor(rep(c(4, 6), each = 8),
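
The three steps listed above can be sketched as follows. The data frame obs and its values are invented for illustration; they are not the data from the thread.

```r
# Hypothetical observations: unit 3 is never recorded on day 6.
obs <- data.frame(day  = c(4, 4, 4, 6, 6),
                  unit = c(1, 2, 3, 1, 2))

# Declare the full range of levels so missing combinations are still tabulated.
obs$day  <- factor(obs$day,  levels = c(4, 6))
obs$unit <- factor(obs$unit, levels = 1:3)

# xtabs() counts every day/unit cell; zero counts mark the omitted rows.
counts <- as.data.frame(xtabs(~ day + unit, data = obs))
counts[counts$Freq == 0, ]
```

Rows with Freq equal to zero identify the day/unit combinations that were omitted from the input.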

Re: [R] Ordering of stack in ggplot (package ggplot2)

2011-10-18 Thread Dennis Murphy
Hi: levels(df.m2$Region) [1] Africa Americas Asia Europe Oceania Reorder your Region factor to the following: df.m2$Region <- factor(df.m2$Region, levels = c('Europe', 'Asia', 'Americas', 'Africa', 'Oceania')) Then recopy the code from the definition of a

Re: [R] hypothetical prediction after polr

2011-10-18 Thread Dennis Murphy
Hi: I think the problem is that you're trying to append the predicted probabilities as a new variable in the (one-line) data frame, when in fact a vector of probabilities is output ( = number of ordered levels of the response) for each new observation. Here's a reproducible example hacked from

Re: [R] Converting list of lists into dataframes

2011-10-17 Thread Dennis Murphy
In the absence of a reproducible example (your example is not reproducible as is), try this: names(help.me) <- as.character(2:4) library('plyr') newDF <- ldply(help.me, rbind) newDF[['.id']] <- as.numeric(newDF[['.id']]) ldply will create a new column named .id that contains the name of the list

Re: [R] question: ragged array

2011-10-16 Thread Dennis Murphy
Hi: Try this: ratok <- data.frame(Id = rep(1:3, 3:1), value = c(2, 3, 4, 2, 1, 5)) aggregate(value ~ Id, data = ratok, FUN = mean) Id value 1 1 3.0 2 2 1.5 3 3 5.0 aggregate() returns a data frame with the Id variable and mean(value). HTH, Dennis On Sun, Oct 16, 2011 at 6:53 AM,

Re: [R] ecdf

2011-10-16 Thread Dennis Murphy
Hi: I don't understand what you're attempting to do. Wouldn't courseid be a categorical variable with a numeric label? If that is so, why are you trying to compute an EDF? An EDF computes cumulative relative frequency of a random variable, which by definition is numeric. If we were talking about
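
As a small illustration of the point that an EDF gives the cumulative relative frequency of a numeric variable, here is a sketch with made-up values (the vector x is an assumption, not data from the thread):

```r
# Made-up numeric sample.
x <- c(1, 2, 2, 3)

# ecdf() returns a step function; Fn(q) is the proportion of values <= q.
Fn <- ecdf(x)
Fn(2)
```

Applied to a categorical label such as a course ID, the same call would run on the numeric codes but the resulting proportions would have no natural interpretation.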

Re: [R] ecdf

2011-10-16 Thread Dennis Murphy
, 2011 at 11:02 PM, David Winsemius dwinsem...@comcast.net wrote: On Oct 16, 2011, at 3:53 PM, Dennis Murphy wrote: Hi: I don't understand what you're attempting to do. Wouldn't courseid be a categorical variable with a numeric label? If that is so, why are you trying to compute an EDF

Re: [R] Split a list

2011-10-14 Thread Dennis Murphy
Hi: Following the lead of others, here's a reproducible example that I believe achieves what you want. # Q1: L <- lapply(1:3, function(n) data.frame(x = rnorm(6), y = rnorm(6), g = rep(1:2, each = 3))) # Using David's suggestion: L1 <- lapply(L, function(d) subset(d, g == 1L)) L2 <- lapply(L,

Re: [R] Applying function to only numeric variable (plyr package?)

2011-10-12 Thread Dennis Murphy
Hi: One approach to this problem in plyr is to use the recently developed mutate() function rather than ddply(). mutate() is a somewhat faster version of transform(); when used as a standalone function, it doesn't take a grouping variable as an argument. For this example, one could use

Re: [R] controling text in facets (ggplot2)

2011-10-11 Thread Dennis Murphy
In the absence of a reproducible example, a general question induces a general response. I'd suggest creating a small data frame that contains the x and y coordinates, a third variable consisting of expressions representing each fitted model and an indicator of the group to which the expression is

Re: [R] get the sorted index of elements within a column

2011-10-11 Thread Dennis Murphy
m <- matrix(rpois(16, 10), ncol = 4) m [,1] [,2] [,3] [,4] [1,] 9 12 8 11 [2,] 12 7 11 8 [3,] 12 7 8 8 [4,] 11 11 4 8 apply(m, 2, sort) [,1] [,2] [,3] [,4] [1,] 9 7 4 8 [2,] 11 7 8 8 [3,] 12 11 8 8 [4,] 12 12

Re: [R] calculate multiple means of one vector

2011-10-10 Thread Dennis Murphy
Hi: Here's one approach: dat <- rnorm(40, 0, 2) positions <- matrix(c(3, 4, 5, 8, 9, 10, 20, 21, 22, 30, 31, 32), ncol = 3, byrow = TRUE) # Subdata t(apply(positions, 1, function(x) dat[x])) [,1] [,2] [,3] [1,] 0.5679765 1.429396 2.9050931

Re: [R] Superposing mean line to xyplot

2011-10-10 Thread Dennis Murphy
Hi: Here's one way to do it, adding the latticeExtra package: array = rep(c('A','B','C'),each = 36) # array replicate spot = rep(1:4,27) # miRNA replicate on each array miRNA = rep(rep(paste('miRNA',1:9,sep='.'),each=4),3) # miRNA label exprs = rnorm(mean=2.8,n = 108) # intensity dat =

Re: [R] SLOW split() function

2011-10-10 Thread Dennis Murphy
I tried this: library(data.table) N <- 1000 T <- N*10 d <- data.table(gp= rep(1:T, rep(N,T)), val=rnorm(N*T), key = 'gp') dim(d) [1] 10002 # On my humble 8Gb system, system.time(l <- d[, split(val, gp)]) user system elapsed 4.15 0.09 4.27 I wouldn't be surprised if

Re: [R] Expand dataframe according to limits defined per row

2011-10-07 Thread Dennis Murphy
Here's one way to do it with the plyr package: library('plyr') f <- function(df) with(df, data.frame(B = B, E = seq(C, D))) ddply(d, 'A', f) A corresponding solution with the data.table package would be library('data.table') dt <- data.table(d, key = 'A') dt[, list(B, E = seq(C, D)), by = 'A']

Re: [R] About stepwise regression problem

2011-10-06 Thread Dennis Murphy
Hi: Please read this: http://www.stata.com/support/faqs/stat/stepwise.html Dennis On Thu, Oct 6, 2011 at 6:41 AM, pigpigmeow gloryk...@hotmail.com wrote: using AIC/BIC, I'm not know too much about this. I just know using p-value to perform stepwise regression if I used p-value to perform

Re: [R] Wide to long form conversion

2011-10-06 Thread Dennis Murphy
Hi: I have some data 'myData' in wide form (attached at the end), and would like to convert it to long form. I wish to have five variables in the result: 1) Subj: factor 2) Group: between-subjects factor (2 levels: s / w) 3) Reference: within-subject factor (2 levels: Me / She) 4) F:

Re: [R] mean of 3D arrays

2011-10-05 Thread Dennis Murphy
Hi: There are a few ways to do this. If you only have a few arrays, you can simply add them and divide by the number of arrays. If you have a large number of such arrays, this is inconvenient, so an alternative is to ship the arrays into a list and use the Reduce() function. For your example, L
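
The Reduce() approach described above can be sketched as follows. The three arrays are made-up constants so the result is easy to check by eye; they are not the arrays from the original question.

```r
# Three small 3D arrays of identical dimensions (illustrative data).
a1 <- array(1, dim = c(2, 2, 2))
a2 <- array(2, dim = c(2, 2, 2))
a3 <- array(6, dim = c(2, 2, 2))

L <- list(a1, a2, a3)
# Reduce() adds the arrays pairwise; dividing by the count gives the elementwise mean.
avg <- Reduce(`+`, L) / length(L)
```

Here every element of avg equals (1 + 2 + 6) / 3 = 3, and the result keeps the 2 x 2 x 2 shape of the inputs.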

Re: [R] Subsetting a data frame with multiple values and exclusions.

2011-10-05 Thread Dennis Murphy
Hi: Is this what you're after? f <- function(x) !any(x %in% terms_exclude) && any(x %in% terms_include) db[apply(db[, -1], 1, f), ] ind test1 test2 test3 2 ind2 227 28.0 4 ind4 3 2 1.2 HTH, Dennis On Wed, Oct 5, 2011 at 8:53 AM, natalie.vanzuydam nvanzuy...@gmail.com

Re: [R] Needed help with 3 factor anova !!!

2011-10-05 Thread Dennis Murphy
Try Googling 'Three factor ANOVA R'; it didn't take long to find a few relevant hits. Dennis On Wed, Oct 5, 2011 at 10:56 AM, rafal rafalpedzi...@gmail.com wrote: I am a student from Poland. What I am interested in is 3 factor anova with R. Could you please help me find an example with using

Re: [R] Display a contingency table on the X11 device

2011-10-05 Thread Dennis Murphy
Hi: One option is the gridExtra package - run the example associated with the tableGrob() function. Another is the addtable2plot() function in the plotrix package. I'm pretty sure there's at least one other package that can do this; I thought it was in the gplots package, but couldn't find one

Re: [R] Display a contingency table on the X11 device

2011-10-05 Thread Dennis Murphy
Thanks, Baptiste. I was looking for tableplot() or something like it and thought textplot() was doing something different. Appreciate the correction. Dennis On Wed, Oct 5, 2011 at 1:29 PM, baptiste auguie baptiste.aug...@googlemail.com wrote: On 6 October 2011 09:23, Dennis Murphy djmu

Re: [R] aggregate function with a dataframe for both x and by

2011-10-05 Thread Dennis Murphy
Hi: It's a little tricky to read in a data frame 'by hand' without making NA a default missing value; you've got to trick it a bit. I'm doing this inefficiently, but if you have the two 'real' data sets stored in separate files, read.table() is the way to go since it provides an option for

Re: [R] counts in quantiles in and from a matrix

2011-10-05 Thread Dennis Murphy
Hi: Here's one way: m <- matrix(rpois(100, 8), nrow = 5) f <- function(x) { q <- quantile(x, c(0.1, 0.9), na.rm = TRUE) c(sum(x < q[1]), sum(x > q[2])) } t(apply(m, 1, f)) HTH, Dennis On Wed, Oct 5, 2011 at 8:11 PM, Ben qant ccqu...@gmail.com wrote: Hello, I'm trying to get the

Re: [R] ggplot2: expression() in legend labels?

2011-10-04 Thread Dennis Murphy
Hi: Here's a reproducible example: d <- data.frame(grp = factor(rep(c('x', 'y'), each = 5)), ev = rnorm(10), dv = rnorm(10)) labl <- list(expression(italic('x')), expression(italic('y'))) ggplot(d, aes(x = ev, y = dv, shape = grp)) + geom_point() + scale_shape_manual('Group',

Re: [R] adding a dummy variable...

2011-10-04 Thread Dennis Murphy
Hi: Here's another way to do it with the plyr package, also not terribly elegant. It assumes that rel.head is a factor in your original data frame: str(df) 'data.frame': 11 obs. of 2 variables: $ ID : Factor w/ 6 levels 17100,17101,..: 1 1 2 3 4 4 5 5 5 6 ... $ rel.head: Factor w/ 3

Re: [R] Question about ggplot2 and stat_smooth

2011-10-04 Thread Dennis Murphy
Hi: The smooth is not going to replicate the quantile estimates you get from the 'boxplots'; the smooth is estimating a conditional mean using loess, with confidence limits associated with uncertainty in the estimate of the conditional mean function, which are almost certainly going to be

Re: [R] Question about ggplot2 and stat_smooth

2011-10-04 Thread Dennis Murphy
Hi Hadley: When I tried your function on the example data, I got the following: dd <- data.frame(year = rep(2000:2008, each = 500), y = rnorm(4500)) g <- function(df, qs = c(.05, .25, .50, .75, .95)) { data.frame(q = qs, quantile(d$y, qs)) } ddply(dd, .(year), g) ddply(dd, .(year), g) year

Re: [R] F-values in nested designs

2011-10-04 Thread Dennis Murphy
Hi: INB4: if I have a nested design with treatment A and treatment B within A, F-values are MSA/MSA(B) and MSA(B)/MSE, correct? How can I make R give these values directly, without further coding? This is how to get an equivalent model in lme4, but it probably isn't what you expect

Re: [R] ggplot2: changing default colors of boxplot

2011-10-04 Thread Dennis Murphy
Hi: Try this: p <- ggplot(mtcars, aes(factor(cyl), mpg)) p + geom_boxplot(aes(colour = factor(am)), fill = 'white') + scale_colour_manual('am', values = c('0' = 'blue', '1' = 'black')) HTH, Dennis On Tue, Oct 4, 2011 at 1:56 PM, Brian Smith bsmith030...@gmail.com wrote: Hi, I wanted to

Re: [R] Assigning factor names to interaction plot

2011-10-03 Thread Dennis Murphy
Hi: A small toy example: fakedata <- data.frame(group = factor(rep(1:3, each = 10), labels = paste('Therapy', 1:3)), city = factor(rep(c('Amsterdam', 'Rotterdam'), each = 5)), pressure = rnorm(30)) with(fakedata, interaction.plot(group, city, pressure,

Re: [R] Question about ggplot2 and stat_smooth

2011-10-03 Thread Dennis Murphy
Hi: I would think that, at least in principle, this should work: a <- ggplot(mtcars, aes(qsec, wt)) a + geom_point() + stat_smooth(fill = 'blue', colour = 'darkblue', size=2, level = 0.9, alpha = 0.2) + stat_smooth(fill = 'blue', colour = 'darkblue', size = 2,

Re: [R] error using ddply to generate means

2011-10-01 Thread Dennis Murphy
Hi: Here's the problem: str(fun3) 'data.frame': 4 obs. of 3 variables: $ sector:'data.frame': 4 obs. of 1 variable: ..$ gics_sector_name: chr Financials Financials Materials Materials $ bebitpcchg: num -0.567 0.996 NA -42.759 $ ticker: chr UBSN VX Equity LLOY LN Equity

Re: [R] Sum of Probabilities in a matrix...

2011-10-01 Thread Dennis Murphy
Let's make it a data frame instead: # Read the data from your post into a data frame named d: d <- read.table(textConnection(" 0.98 2 0.2 1 0.01 2 0.5 1 0.6 6")) closeAllConnections() # Use the ave() function and append the result to d: d$sumprob <- with(d, ave(V1, V2, FUN = sum))

Re: [R] For loop for subset - repeating same over and over?

2011-09-30 Thread Dennis Murphy
Hi: This would be a lot easier to check with a reproducible example, but here's a simplified version of your problem: testd <- data.frame(gps = rep(c('ADHALP','ADLCON','ADLARC','BDALAT','BDPARC'), each = 15), trt = rep(LETTERS[1:3], each = 5),

Re: [R] Overlapping plot in lattice

2011-09-30 Thread Dennis Murphy
Hi: One way is to create a vector of pch values that you can pass into xyplot, e.g., dd <- data.frame(x = 1:10, y = 1:10, pch = c(rep(1, 5), 16, rep(1, 4))) library('lattice') xyplot(y ~ x, data = dd, pch = dd$pch, col = 1) HTH, Dennis On Fri, Sep 30, 2011 at 12:01 AM, Kang Min

Re: [R] apply lm function to dataset split by two variables

2011-09-28 Thread Dennis Murphy
Hi: Here's one way to do it with the plyr package: dd <- read.table(textConnection(" year sps cm w 2009 50 16 22 2009 50 17 42 2009 50 18 45 2009 51 15 45 2009 51 16 53 2009 51 17 73 2010 50 15

Re: [R] Restructuring data - unstack, reshape?

2011-09-26 Thread Dennis Murphy
Hi: Here's one approach using the reshape() function in base R: # Read in your data: d <- read.table(textConnection(" Candidate.ID Specialty Office Score 110002 C London 47 110002 C East 48

Re: [R] Conditional Evaluation

2011-09-26 Thread Dennis Murphy
Hi: The problem is that in your example, you have unequal numbers of rows in B that match the 1's pattern in A[i, ]. The function below cycles through the rows of A and returns, for each row of A, the rows in B that have 1's in the same columns as A[i, ]. By necessity, this returns a list. Notice

Re: [R] selecting first row of a variable with long-format data

2011-09-25 Thread Dennis Murphy
Hi: The head() function is helpful here: (i) plyr::ddply() library('plyr') ddply(dat, .(id), function(d) head(d, 1)) id value 1 1 5 2 2 4 (ii) aggregate(): aggregate(value ~ id, data = dat, FUN = function(x) head(x, 1)) id value 1 1 5 2 2 4 The formula version of

Re: [R] create variables through a loop

2011-09-22 Thread Dennis Murphy
Hi: Here's one approach with package reshape2. I copied the same data frame 10 times to illustrate the idea, but as long as your data frames have the same structure (same variables, same dimension), this should work. library('plyr') library('reshape2') # Use your example data as the template:

Re: [R] Bivariate Scatter Plots with Lattice

2011-09-22 Thread Dennis Murphy
Hi: Question: Do you want 37 different panels with plots of quant vs. date by param, or two panels (one per chemical) with all 37 streams? If you only want two of the eight chemicals, I'd suggest using subset() to select out the pair you want and then redefine the param factor so that the subset

Re: [R] Bivariate Scatter Plots with Lattice

2011-09-22 Thread Dennis Murphy
I don't see the problem. AFAICT, you want plots of quant vs. sampdate by chemicals for each of the 37 streams. Essentially, you seem to want two superimposed time plots in each panel. A time plot is a scatterplot with a time-ordered (usually horizontal) axis. If you use ggplot2 or lattice to

Re: [R] How to transfer variable names to column names?

2011-09-20 Thread Dennis Murphy
Hi: Using Michael's example data, here's another approach: x = data.frame(x = 1:5) y = data.frame(y = 1:10) z = data.frame(z = 1:3) # Generate a vector to name the list components nms <- c('x', 'y', 'z') # Combine data frames into a list L <- list(x, y, z) # name them names(L) <- nms # Use

Re: [R] Hourly data with zoo

2011-09-12 Thread Dennis Murphy
Hi Steven: How about this? d <- rep(20110101,24) h <- sprintf('%04d', seq(0, 2300, by = 100)) df <- data.frame(LST_DATE = d, LST_TIME = h, data = rnorm(24, 0, 1)) df <- transform(df, datetime = as.POSIXct(paste(LST_DATE, LST_TIME), format = '%Y%m%d %H%M')) library(zoo) X

Re: [R] On-line machine learning packages?

2011-09-12 Thread Dennis Murphy
http://cran.r-project.org/web/views/ Look for 'machine learning'. Dennis On Sun, Sep 11, 2011 at 11:33 PM, Jay josip.2...@gmail.com wrote: If the answer is so obvious, could somebody please spell it out? On Sep 11, 10:59 pm, Jason Edgecombe ja...@rampaginggeek.com wrote: Try this:

Re: [R] regression on data subsets in datafile

2011-09-12 Thread Dennis Murphy
Hi: Here's one approach: # date typo fixed in record 5 - changed 35 to 5 tC <- textConnection(" Subject Date parameter1 bob 3/2/99 10 bob 4/2/99 10 bob 5/5/99 10 bob 6/27/99 NA bob 8/5/01 10 bob 3/2/02 10 steve 1/2/99 4 steve 2/2/00 7 steve 3/2/01 10 steve

Re: [R] Accessing graphs via url

2011-09-10 Thread Dennis Murphy
Hi: Start here: http://www.quantmod.com/examples/ HTH, Dennis On Fri, Sep 9, 2011 at 10:44 PM, veepsirtt veepsi...@gmail.com wrote: Hi I want to bring the graph from the site www.avasaram.com for the symbol GOOG here is the link

Re: [R] reshape data from long to wide format

2011-09-09 Thread Dennis Murphy
Hi: There are two ways to go about this, depending on what you want. # 1: Jean's output format: # I'm using the reshape package here with cast(), # but it should work similarly with dcast() in reshape2: cast(example, DATE ~ SENSOR, value = 'VALUE') DATE A B C D E F 1

Re: [R] Read a list of files into named R data.frames

2011-09-09 Thread Dennis Murphy
Hi: Hmm, these files look familiar...Lahman database? :) I have the 2010 set, so here's how I got this to work on my system: files <- list.files(pattern = '*.csv') files [1] Allstar.csv AllstarFull.csv [3] Appearances.csv AwardsManagers.csv [5] AwardsPlayers.csv

Re: [R] Question about plot.mona {cluster}

2011-09-09 Thread Dennis Murphy
Hi: Here are a couple of ways to do this. The barplot() attempt is pretty close to the original banner plot except that the color is transparent to the (white) background. The ggplot2 graph is similar, but the code may be less familiar if you don't use the package. In the latter, the variable

Re: [R] ggplot2 freqpoly() layers..?

2011-09-08 Thread Dennis Murphy
Hi: You could try something like this: # rbind the two melted data frames: comb <- rbind(mat2, tab2) # create a group factor to distinguish the two data sets comb <- mutate(comb, gp = factor(rep(1:2, each = 1))) # Generate the plot: ggplot(comb) + geom_freqpoly(aes(x = value, y =

Re: [R] problem with math expressions in grid graphics when using line breaks (\n)

2011-09-08 Thread Dennis Murphy
From the plotmath help page: Control characters (e.g. \n) are not interpreted in character strings in plotmath, unlike normal plotting. In other words, you can't do that. However, since you're using grid graphics, this thread from the ggplot2 group might be helpful:

Re: [R] Spider (Radar) Plot

2011-09-08 Thread Dennis Murphy
# install.packages('sos') library(sos) findFn('radar plot') gets 29 hits on my system. The two that seem to be the most relevant are radarchart() from the fmsb package and, as mentioned in the other reply, stars() from the autoloaded graphics package. BTW, read the quoted text and observe how

Re: [R] Problem with by statement for spaghetti plots

2011-09-07 Thread Dennis Murphy
Hi: Your code doesn't work, but if you want to do spaghetti plots, they're rather easy to do in ggplot2 or lattice. The first problem you have is that one group in your test data has all NA responses, so by() or any other groupwise summarization function is going to choke unless you give it a

Re: [R] Problem with by statement for spaghetti plots

2011-09-07 Thread Dennis Murphy
~ age, data = w, group = .id, panel = function(x, y, groups, ...) { panel.xyplot(x, y, groups = groups, type = 'l', col = 1, ...) panel.lmline(x, y, col = 'blue', lwd = 2) } ) Dennis On Wed, Sep 7, 2011 at 2:07 PM, Dennis Murphy djmu

Re: [R] ggplot2-Issue placing error bars behind data points

2011-09-07 Thread Dennis Murphy
Hi: For your test data, try this: # Result of dput(NerveSurv) NerveSurv <- structure(list(Time = c(0, 0.25, 0.5, 0.75, 1, 1.25, 1.5, 0, 0.25, 0.5, 0.75, 1, 1.25, 1.5, 0, 0.25, 0.5, 0.75, 1, 1.25, 1.5 ), SAP = c(1, 1.04, 1.04, 1.06, 1.04, 1.22, 1.01, 1, 1.01, 1.01, 1.06, 1.01, 0.977, 0.959, 1,

Re: [R] how to create data.frames from vectors with duplicates

2011-09-07 Thread Dennis Murphy
Hi: Here are a few informal timings on my machine with the following example. The data.table package is worth investigating, particularly in problems where its advantages can scale with size. library(data.table) dt <- data.table(x = sample(1:50, 100, replace = TRUE), y =

Re: [R] problem in applying function in data subset (with a level) - using plyr or other alternative are also welcome

2011-09-03 Thread Dennis Murphy
Hi: I tried to figure out what you were doing...some of it I think I grasped, other parts not so much. On Fri, Sep 2, 2011 at 8:18 PM, Maya Joshi maya.d.jo...@gmail.com wrote: Dear R experts. I might be missing something obvious. I have been trying to fix this problem for some weeks. Please

Re: [R] Background fill and border for a legend in dotplot

2011-09-02 Thread Dennis Murphy
Hi: Try this: key1 <- draw.key(list(text=list(levels(Cal_dat$Commodity)), title = 'Ore type', border = TRUE, background = 'ivory', points=list(pch=22, cex=1.3, fill=col.pat, col = 'black')), draw = FALSE) key2 <-

Re: [R] merge some columns

2011-09-02 Thread Dennis Murphy
Hi: Here's one approach: d <- read.table(textConnection(" V1 V2 V3 V4 V5 V6 1 G A G G G G 2 A A G A A G"), header = TRUE, stringsAsFactors = FALSE) closeAllConnections() # Create two vectors of variable names, one for odd numbered, # one for even numbered vars1 <-

Re: [R] vector output loop or function

2011-09-01 Thread Dennis Murphy
Hi: Here's one approach: X1 <- sample(1:4, 10, replace = TRUE, prob = c(0.4, 0.2, 0.2, 0.2)) foo <- function(x) { m <- matrix(NA, nrow = length(x), ncol = length(x)) m[, 1] <- x idx <- seq_len(length(x)) for(j in idx[-1]) { k <- sample(idx, 2) x <- replace(x, k, 5) m[, j]

Re: [R] ggplot2 to create a square plot

2011-09-01 Thread Dennis Murphy
From: Dennis Murphy djmu...@gmail.com To: Alaios ala...@yahoo.com Cc: R-help@r-project.org R-help@r-project.org Sent: Wednesday, August 31, 2011 9:34 PM Subject: Re: [R] ggplot2 to create a square plot Hi: I'd suggest using ggsave(); in particular, see its height

Re: [R] Fitting my data to a Weibull model

2011-08-31 Thread Dennis Murphy
Hi: Things work if x is the response and y is the covariate. To use the approach I describe below, you need RStudio and its manipulate package (which is only available in RStudio - you won't find it on CRAN). You can download and install RStudio freely from http://rstudio.org/ ; it is available

Re: [R] ggplot2 to create a square plot

2011-08-31 Thread Dennis Murphy
Hi: I'd suggest using ggsave(); in particular, see its height = and width = arguments. If you have some time, you could look at some examples of ggplot2 themes: https://github.com/hadley/ggplot2/wiki/themes and some examples of how to use various opts():

Re: [R] Multivariate Normal: Help wanted!

2011-08-30 Thread Dennis Murphy
Hi: It's polite to mention from which package you extract certain functions. For example, the function dmvnorm() exists in at least the following packages (following a search from the sos package): mixtools, emdbook, klaR and mvtnorm, not to mention related functions in three other packages.

Re: [R] Legend / bar order - ggplot2

2011-08-29 Thread Dennis Murphy
Hi: The bars *are* ordered in the same way, but when you use coord_flip(), the left category goes on top and the right category goes on the bottom. Is this what you want? ggplot(df, aes(x = name, y = value, fill = type)) + geom_bar(position = position_dodge()) + coord_flip() +

Re: [R] maximum number of subdivisions reached

2011-08-29 Thread Dennis Murphy
Hi: integrate() is not a vectorized function. This appears to work: sapply(1:2, function(x) func(x, 0.1, 0.1, sad = Exp)) [1] 0.250 0.125 In this case, sapply() is a disguised for loop. HTH, Dennis On Mon, Aug 29, 2011 at 9:45 AM, . . xkzi...@gmail.com wrote: Ooops, sorry! The problem
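
To illustrate why sapply() is needed when integrate() has to be run for several parameter values, here is a self-contained sketch. The function area() and the exponential integrand are assumptions for illustration; they are not the func() from the thread.

```r
# Integrate exp(-rate * x) over [0, Inf) for a single rate; the exact value is 1 / rate.
area <- function(rate) {
  integrate(function(x) exp(-rate * x), lower = 0, upper = Inf)$value
}

# integrate() handles one problem per call, so loop over the rates with sapply().
areas <- sapply(1:2, area)
```

Each call to area() runs one numerical integration, so sapply() simply repeats that once per rate.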

Re: [R] reading tables from multiple HTML pages

2011-08-29 Thread Dennis Murphy
?tryCatch HTH, Dennis On Mon, Aug 29, 2011 at 9:04 AM, s1oliver s1oli...@ucsd.edu wrote: Hi, beginner to R and was having some problems scraping data from tables in html using the XML package. I have included some code below. I am trying to loop through a series of html pages, each of which
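
A minimal sketch of how tryCatch() can keep a loop going when one item fails. The helper parse_one() and the input strings are hypothetical stand-ins for reading one HTML page; no XML package calls are shown here.

```r
# Hypothetical parser that fails on some inputs (stands in for reading one page).
parse_one <- function(x) {
  if (!grepl("^[0-9]+$", x)) stop("not a number: ", x)
  as.numeric(x)
}

inputs <- c("10", "oops", "30")
# tryCatch() converts an error into NA so the loop carries on with the next item.
results <- vapply(inputs, function(x) {
  tryCatch(parse_one(x), error = function(e) NA_real_)
}, numeric(1))
```

The failed item shows up as NA in results, which makes it easy to find and retry the problem pages afterwards.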

Re: [R] splitting into multiple dataframes and then create a loop to work

2011-08-29 Thread Dennis Murphy
Hi: This is straightforward to do with the plyr package: # install.packages('plyr') library('plyr') set.seed(1234) df <- data.frame(clvar = rep(1:4, each = 10), yvar = rnorm(40, 10, 6), var1 = rnorm(40, 10, 4), var2 = rnorm(40, 10, 4), var3 = rnorm(40, 5, 2),

Re: [R] splitting into multiple dataframes and then create a loop to work

2011-08-29 Thread Dennis Murphy
Hi: Dimitris' solution is appropriate, but it needs to be mentioned that the approach I offered earlier in this thread differs from the lmList() approach. lmList() uses a pooled measure of error MSE (which you can see at the bottom of the output from summary(mlis) ), whereas the plyr approach
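
The difference between a pooled error estimate and separate per-model estimates can be sketched by hand in base R. This is an illustration of the idea only, not lmList()'s internal code; mtcars and the grouping variable cyl are assumptions.

```r
# Separate fits, one per group (illustrative data and grouping variable).
groups <- split(mtcars, mtcars$cyl)
fits <- lapply(groups, function(d) lm(mpg ~ wt, data = d))

# Separate fits: each model has its own residual standard error.
sigma_each <- sapply(fits, function(m) summary(m)$sigma)

# Pooled estimate: total residual sum of squares over total residual df.
rss <- sapply(fits, function(m) sum(resid(m)^2))
dfr <- sapply(fits, df.residual)
sigma_pooled <- sqrt(sum(rss) / sum(dfr))
```

Because the pooled variance is a degrees-of-freedom-weighted average of the group variances, sigma_pooled always lies between the smallest and largest values in sigma_each; standard errors based on it will therefore differ from those of the separate fits.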

Re: [R] to represent color range on plot segment

2011-08-27 Thread Dennis Murphy
On Sat, Aug 27, 2011 at 11:07 AM, karthicklakshman karthick.laksh...@gmail.com wrote: Dear R community, With an advantage of being NEW to R, I would like to post a very basic query here, Really? I found two posts with your name on it dating from October and November of 2010.

Re: [R] cbind giving NA's?

2011-08-26 Thread Dennis Murphy
Hi: Try this: require('xts') merge.zoo(zoo(a), zoo(b), all = c(TRUE, TRUE)) ZWD.UGX SCHB.Close 2010-03-31 NA 28.02 2010-04-01 7.6343 NA 2010-04-02 7.6343 NA 2010-04-03 7.5458 NA 2010-04-04 7.4532 28.30 2010-04-05 7.4040 28.38

Re: [R] how to read a group of files into one dataset?

2011-08-25 Thread Dennis Murphy
Hi: Similar in vein to the other respondents, you could try something like this: On Thu, Aug 25, 2011 at 1:17 AM, Jie TANG totang...@gmail.com wrote: for example : I have files with the name  ma01.dat,ma02.dat,ma03.dat,ma04.dat,I want to read the data in these files into one data.frame #
