[R] Reassign values based on multiple conditions
Hi all, I have a simple data frame of three columns: one of numbers (really a categorical variable), one of dates and one of data. Imagine:

collar date data
1 01/01/2013 x
2 02/01/2013 y
3 04/01/2013 z
4 04/01/2013 a
5 07/01/2013 b

The 'collar' is a GPS collar that's been worn by an animal for a certain amount of time, and may then have been worn by a different animal after the batteries needed to be changed. When an animal was caught and the collar battery needed changing, a whole new collar had to be put on, as these animals (wild boar and red deer!) were not that easy to catch. In order to follow the movements of each animal I now need to create a new column that assigns the 'data' by animal rather than by collar. I have a table of dates, e.g.

animal collar start_date end_date
1 1 01/01/2013 03/01/2013
1 5 04/01/2013 06/01/2013
1 3 07/01/2013 09/01/2013
2 2 01/01/2013 03/01/2013
2 1 04/01/2013 06/01/2013

I have so far been able to make multi-conditional tests:

animal1test <- (date >= "01/01/13" & date <= "03/01/13")
animal1test2 <- (date >= "04/01/13" & date <= "06/01/13")
animal2test <- (date >= "04/01/13" & date <= "06/01/13")

to use in an 'if else' formula:

if(animal1test){
  collar[1] <- "animal1"
} else if(animal1test2){
  collar[5] <- "animal1"
} else if(animal2test){
  collar[1] <- "animal2"
} else NA

As I'm sure you can see, this is completely inelegant, and also not working for me! Any ideas on how to achieve this? Thanks SO much in advance, Cat

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
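A lookup-table merge avoids the chain of if/else tests entirely. The sketch below is one possible approach, not taken from the thread: it assumes the deployment log is a data frame (here called `deploy`, with the columns from the question) and matches each GPS fix to the deployment interval its date falls in.

```r
# Hypothetical sketch: assign an animal to each GPS fix by interval lookup.
# 'fixes' mimics the first data frame in the question; 'deploy' mimics the
# table of animal/collar/start/end dates.
fixes <- data.frame(collar = c(1, 2, 3, 4, 5),
                    date   = as.Date(c("2013-01-01", "2013-01-02", "2013-01-04",
                                       "2013-01-04", "2013-01-07")),
                    data   = c("x", "y", "z", "a", "b"))
deploy <- data.frame(animal     = c(1, 1, 1, 2, 2),
                     collar     = c(1, 5, 3, 2, 1),
                     start_date = as.Date(c("2013-01-01", "2013-01-04", "2013-01-07",
                                            "2013-01-01", "2013-01-04")),
                     end_date   = as.Date(c("2013-01-03", "2013-01-06", "2013-01-09",
                                            "2013-01-03", "2013-01-06")))
# Join every fix to every deployment of the same collar, then keep only the
# rows whose date falls inside that deployment's window.
m <- merge(fixes, deploy, by = "collar")
m <- m[m$date >= m$start_date & m$date <= m$end_date, ]
m[, c("animal", "collar", "date", "data")]
```

Fixes whose date falls outside every logged interval for their collar simply drop out of the result, which makes gaps in the deployment log easy to spot.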
Re: [R] column and line graphs in R
When you send data, use dput() to send them. It is much easier for people who want to help you. Here is an example. I am not sure if it is what you want, but you can play with the code. Sincerely, Marc

fungal <- structure(list(rel.abund = c(0.003, 0.029, 0.033, 0.023, 0.009,
  0.042, 0.069, 0.059, 0.034, 0.049, 0.084, 0.015, 0.059, 0.032, 0.142,
  0.031, 0.034, 0.01, 0.011, 0.004, 0.034, 0.182), rel.freq = c(0.083,
  0.167, 0.167, 0.083, 0.083, 0.25, 0.083, 0.167, 0.083, 0.083, 0.333,
  0.083, 0.083, 0.167, 0.25, 0.083, 0.083, 0.083, 0.083, 0.083, 0.333,
  0.417)), .Names = c("rel.abund", "rel.freq"), class = "data.frame",
  row.names = c("MOTU2", "MOTU4", "MOTU6", "MOTU7", "MOTU9", "MOTU11",
  "MOTU14", "MOTU16", "MOTU17", "MOTU18", "MOTU19", "MOTU20", "MOTU21",
  "MOTU22", "MOTU23", "MOTU24", "MOTU25", "MOTU29", "MOTU30", "MOTU33",
  "MOTU36", "MOTU34"))

premar <- par("mar")
par(mar = c(5, 4, 4, 4) + 0.1)
plot(fungal[, 1], type = "h", lwd = 20, lend = 2, bty = "n", xlab = "",
     ylab = "Relative abundance", xaxt = "n", ylim = c(0, 0.2))
par(xpd = TRUE)
segments(-2.5, 0.01, -2.5, 0.03, lwd = 20, lend = 2, col = "black")
par(new = TRUE)
plot(fungal[, 2], type = "p", bty = "n", pch = 16, col = "red", axes = FALSE,
     xlab = "", ylab = "", main = "", ylim = c(0, 0.5))
axis(1, at = 1:length(rownames(fungal)), labels = rownames(fungal), las = 2)
axis(4)
mtext("Relative frequency", side = 4, line = 3)
points(25.6, 0.1, pch = 16, col = "red")
par(mar = premar)

On 14/03/13 15:40, Gian Maria Niccolò Benucci wrote: Hi again, Thank you all for your support. I would love to have a graph in which two variables are shown at the same time. For example a histogram and a curve should be the perfect choice. I tried to use twoord.plot() but I am not sure I understand how to manage the arguments lx, ly, rx, ry... Anyway these are my data:

nat_af rel.abund rel.freq
MOTU2 0.003 0.083
MOTU4 0.029 0.167
MOTU6 0.033 0.167
MOTU7 0.023 0.083
MOTU9 0.009 0.083
MOTU11 0.042 0.250
MOTU14 0.069 0.083
MOTU16 0.059 0.167
MOTU17 0.034 0.083
MOTU18 0.049 0.083
MOTU19 0.084 0.333
MOTU20 0.015 0.083
MOTU21 0.059 0.083
MOTU22 0.032 0.167
MOTU23 0.142 0.250
MOTU24 0.031 0.083
MOTU25 0.034 0.083
MOTU29 0.010 0.083
MOTU30 0.011 0.083
MOTU33 0.004 0.083
MOTU36 0.034 0.333
MOTU34 0.182 0.417

First column is the relative abundance of the given MOTU and second column is the relative frequency of the same MOTU. Thank you very much in advance,

-- Marc Girondot, Pr Laboratoire Ecologie, Systématique et Evolution Equipe de Conservation des Populations et des Communautés CNRS, AgroParisTech et Université Paris-Sud 11, UMR 8079 Bâtiment 362, 91405 Orsay Cedex, France Tel: 33 (0)1.69.15.72.30 Fax: 33 (0)1.69.15.73.53 e-mail: marc.giron...@u-psud.fr Web: http://www.ese.u-psud.fr/epc/conservation/Marc.html Skype: girondot
[R] Difficulty with UNIQUE
I need to extract labels from Excel input data to use as dimnames later on. I can successfully read the Excel data into three matrices:

capacity <- read.csv("c:\\R\\data\\capacity.csv")
price.lookup <- read.csv("c:\\R\\data\\price lookup.csv")
sales <- read.csv("c:\\R\\data\\sales.csv")

The values to be used as dimnames are duplicated in the matrices. For example, I would like to create

dimnames(out.table)[[3]] <- c("a", "b", "c")

by not explicitly entering the first three letters of the alphabet but by something like

dimnames(out.table)[[3]] <- pl.names

but I cannot generate unique values with

pl.names <- unique(with(price.lookup, list(Price_Line)))
pl.names
[[1]]
 [1] a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a c c c c c c c
[44] c c c c c c c c c c c c c c c c c c c c c c c c c c c c c b b b b b b b b b b b b b b
[87] b b b b b b b b b b b b b b b b b b b b b b
Levels: a b c

Can someone please suggest how I can grab a, b, c from with(price.lookup, list(Price_Line))? Thank you,

-- Barry E. King, Ph.D. Director of Retail Operations Qualex Consulting Services, Inc. barry.k...@qlx.com O: (317)940-5464 M: (317)507-0661
[R] phyper returning zero
Hi, I am attempting to use phyper to test the significance of two overlapping lists. I keep getting a zero and wondered whether that indicates non-significance of my overlap or a p-value too small to calculate?

overlap = 524
lista = 2784
totalpop = 54675
listb = 1296
phyper(overlap, lista, totalpop, listb, lower.tail = FALSE, log.p=F)
[1] 0

If I plug in some different values I get a p-value, but since zero is actually lower, is the overlap significant, or more likely have I made a mistake in using the function?

phyper(10, 100, 2, 100, lower.tail = FALSE, log.p=F)
[1] 2.582795e-12

Thanks Elliott
Re: [R] column and line graphs in R
Thank you very much to you all, I'll play with the code and post mine once I have tested it. Cheers, -- Gian

On 14 March 2013 16:27, John Kane jrkrid...@inbox.com wrote:

The easiest way to supply data is to use the dput() function. Example with your file named testfile: dput(testfile). Then copy the output and paste it into your email. For large data sets, you can just supply a representative sample. Usually, dput(head(testfile, 100)) will be sufficient.

Generally speaking, two y-axis scales are to be avoided if at all possible. Faceting is likely to give you better results, although I see that the scale differences are annoyingly large. It is possible to plot the two facets of the graph independently in order to have two independent y-axes, but it takes more work and may or may not be needed.

Here is a possible approach based on ggplot2. You will probably have to install ggplot2 and reshape2 using install.packages(). Notice I've changed your variable names around and turned your data into a data frame with the matrix row.names as another variable.

##===begin code==#
library(reshape2)
library(ggplot2)
dat1 <- read.table(text="place abund freq
MOTU2 0.003 0.083
MOTU4 0.029 0.167
MOTU6 0.033 0.167
MOTU7 0.023 0.083
MOTU9 0.009 0.083
MOTU11 0.042 0.250
MOTU14 0.069 0.083
MOTU16 0.059 0.167
MOTU17 0.034 0.083
MOTU18 0.049 0.083
MOTU19 0.084 0.333
MOTU20 0.015 0.083
MOTU21 0.059 0.083
MOTU22 0.032 0.167
MOTU23 0.142 0.250
MOTU24 0.031 0.083
MOTU25 0.034 0.083
MOTU29 0.010 0.083
MOTU30 0.011 0.083
MOTU33 0.004 0.083
MOTU36 0.034 0.333
MOTU34 0.182 0.417
", sep="", header=TRUE, stringsAsFactors=FALSE)
str(dat1)
dm1 <- melt(dat1, id = "place", variable.name="type", value.name="freq")
str(dm1)
# plot first alternative
ggplot(dm1, aes(place, freq, colour = type, group = type)) +
  geom_line(group = 1) +
  facet_grid(type ~ .)
# or plot second alternative
ggplot(dm1, aes(place, freq, colour = type, group = type)) +
  geom_line(group = 1) +
  facet_grid(. ~ type)
##end code===#

-Original Message- From: gian.benu...@gmail.com Sent: Thu, 14 Mar 2013 15:40:53 +0100 To: r-help@r-project.org Subject: Re: [R] column and line graphs in R

Hi again, Thank you all for your support. I would love to have a graph in which two variables are shown at the same time. For example a histogram and a curve should be the perfect choice. I tried to use twoord.plot() but I am not sure I understand how to manage the arguments lx, ly, rx, ry... Anyway these are my data:

nat_af rel.abund rel.freq
MOTU2 0.003 0.083
MOTU4 0.029 0.167
MOTU6 0.033 0.167
MOTU7 0.023 0.083
MOTU9 0.009 0.083
MOTU11 0.042 0.250
MOTU14 0.069 0.083
MOTU16 0.059 0.167
MOTU17 0.034 0.083
MOTU18 0.049 0.083
MOTU19 0.084 0.333
MOTU20 0.015 0.083
MOTU21 0.059 0.083
MOTU22 0.032 0.167
MOTU23 0.142 0.250
MOTU24 0.031 0.083
MOTU25 0.034 0.083
MOTU29 0.010 0.083
MOTU30 0.011 0.083
MOTU33 0.004 0.083
MOTU36 0.034 0.333
MOTU34 0.182 0.417

First column is the relative abundance of the given MOTU and second column is the relative frequency of the same MOTU. Thank you very much in advance, -- Gian

On 14 March 2013 14:51, John Kane jrkrid...@inbox.com wrote:

http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example

You really need to read the posting guide and supply some sample data at the very least. Here is about as simple-minded a plot as R will do as an example, however:

dat1 <- structure(list(abond = c(17L, 3L, 6L, 11L, 5L, 8L, 13L, 16L, 15L, 2L),
                       freq = c(17L, 14L, 7L, 13L, 19L, 5L, 3L, 20L, 9L, 10L)),
                  .Names = c("abond", "freq"), row.names = c(NA, -10L),
                  class = "data.frame")
plot(dat1$abond, col = "red")
lines(dat1$freq, col = "blue")

John Kane Kingston ON Canada

-Original Message- From: gian.benu...@gmail.com Sent: Thu, 14 Mar 2013 11:05:40 +0100 To: r-help@r-project.org Subject: [R] column and line graphs in R

Hi all, I would love to plot my data with R. I have abundance and frequency of fungal taxonomic data that should be plotted in the same graph. In Microsoft Excel that is possible, but the graphic result is, as always, very poor. Is
Re: [R] Difficulty with UNIQUE
with(price.lookup, list(Price_Line)) is a list! Use

unique(unlist(with(price.lookup, list(Price_Line))))

-Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Barry King Sent: Friday, 15 March 2013 09:34 To: r-help@r-project.org Subject: [R] Difficulty with UNIQUE
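The root cause is that wrapping the column in list() produces a one-element list, and unique() then compares whole list elements rather than the values inside. A minimal illustration, with made-up data standing in for the poster's price.lookup:

```r
# Made-up stand-in for the poster's price.lookup data frame; Price_Line
# becomes a factor, as it would after read.csv() in R of that era.
price.lookup <- data.frame(Price_Line = factor(c("a", "a", "c", "c", "b", "b")))

# Work on the vector itself, not a list wrapping it:
pl.names <- unique(as.character(price.lookup$Price_Line))
pl.names                              # unique values in order of appearance

# Since the column is a factor, levels() gives the same labels in level order:
levels(price.lookup$Price_Line)
```

as.character() also strips the factor class, so pl.names is a plain character vector ready to be assigned into dimnames().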
Re: [R] phyper returning zero
On Fri, Mar 15, 2013 at 8:52 AM, elliott harrison e.harri...@epistem.co.uk wrote: Hi, I am attempting to use phyper to test the significance of two overlapping lists. I keep getting a zero and wondered if that was determining non-significance of my overlap or a p-value too small to calculate?

overlap = 524
lista = 2784
totalpop = 54675
listb = 1296
phyper(overlap, lista, totalpop, listb, lower.tail = FALSE, log.p=F)
[1] 0

If you set log.p = T, you see that the _log_ of the desired value is about -800, so it's likely simply too small to fit in an IEEE double. In short, for any and all practical purposes, your p-value is zero. Cheers, MW
Re: [R] phyper returning zero
Thanks Michael, I assumed as much but we know what that did. Thanks again. Elliott

-Original Message- From: R. Michael Weylandt [mailto:michael.weyla...@gmail.com] Sent: 15 March 2013 09:29 To: elliott harrison Cc: r-help@r-project.org Subject: Re: [R] phyper returning zero

If you set log.p = T, you see that the _log_ of the desired value is about -800, so it's likely simply too small to fit in an IEEE double. In short, for any and all practical purposes, your p-value is zero. Cheers, MW
Re: [R] phyper returning zero
>>>>> "eh" == elliott harrison e.harri...@epistem.co.uk on Fri, 15 Mar 2013 08:52:36 +0000 writes:

eh> Hi, I am attempting to use phyper to test the significance of two overlapping lists. I keep getting a zero and wondered if that was determining non-significance of my overlap or a p-value too small to calculate?

well, what do you guess? (:-)

eh> overlap = 524
eh> lista = 2784
eh> totalpop = 54675
eh> listb = 1296
eh> phyper(overlap, lista, totalpop, listb, lower.tail = FALSE, log.p=F)
eh> [1] 0

Well, just *do* use log.p=TRUE:

phyper(overlap, lista, totalpop, listb, lower.tail = FALSE, log.p=TRUE)
[1] -800.0408

so, indeed P = exp(-800), which is smaller than the smallest positive number in double precision, which by the way is available in R as

.Machine$double.xmin
[1] 2.225074e-308

I'm pretty sure that I cannot think of a situation where it is important to know that the more exact probability is around 10^(-347.45)

phyper(overlap, lista, totalpop, listb, lower.tail = FALSE, log.p=TRUE) / log(10)
[1] -347.4533

rather than to know that it is very very very small.

Martin

eh> If I plug in some different values I get a p-value but since zero is actually lower is the overlap significant, or more likely have I made a mistake in using the function?
eh> phyper(10, 100, 2, 100, lower.tail = FALSE, log.p=F)
eh> [1] 2.582795e-12
eh> Thanks
eh> Elliott
Re: [R] Add a continuous color ramp legend to a 3d scatter plot
On 14/03/13 18:15, Zhuoting Wu wrote: I have two follow-up questions: 1. If I want to reverse the heat.colors (i.e., from yellow to red instead of red to yellow), is there a way to do that?

nbcol <- heat.colors(128)
nbcol <- nbcol[128:1]

2. I also created this interactive 3d scatter plot as below:

library(rgl)
plot3d(x=x, y=y, z=z, col=nbcol[zcol], box=FALSE)

I have never used such a plot. Sorry. Marc

Is there any way to add the same legend to this 3d plot? I'm new to R and trying to learn it. I'm very grateful for any help! thanks, Z

-- Marc Girondot
[R] ggplot2, arrows and polar coordinates
Dear R users, The following issue has already been documented but, if I am not mistaken, not yet solved. The issue appears while trying to plot arrows with geom_segment (package ggplot2) in polar coordinates (coord_polar): the direction of some arrows is wrong (red rectangle). Please find herewith an example. Does someone know how to deal with this issue? Best Regards, Pascal Oettli

#--
# Example adapted from the help page of geom_segment
library(ggplot2)
library(grid)
d <- data.frame(x1=-135.3, x2=-158.3, y1=37.2, y2=45.2)
p <- ggplot(seals, aes(x = long, y = lat))
p1 <- ggplot() + coord_cartesian() +
  geom_rect(data=d, mapping=aes(xmin=x1, xmax=x2, ymin=y1, ymax=y2),
            fill="red", color="red", alpha=0.5) +
  geom_segment(data=seals, aes(x = long, y = lat, xend = long + delta_long,
                               yend = lat + delta_lat),
               arrow = arrow(length = unit(0.2, "cm")))
p2 <- ggplot() + coord_polar() +
  geom_rect(data=d, mapping=aes(xmin=x1, xmax=x2, ymin=y1, ymax=y2),
            fill="red", color="red", alpha=0.5) +
  geom_segment(data=seals, aes(x = long, y = lat, xend = long + delta_long,
                               yend = lat + delta_lat),
               arrow = arrow(length = unit(0.2, "cm")))
grid.newpage()
pushViewport(viewport(layout = grid.layout(3, 2, heights = unit(c(0.5, 0.5, 5), "null"))))
grid.text("Example taken from '?geom_segment'", vp = viewport(layout.pos.row = 1, layout.pos.col = 1:2))
grid.text("Cartesian coordinates", vp = viewport(layout.pos.row = 2, layout.pos.col = 1))
grid.text("Polar coordinates", vp = viewport(layout.pos.row = 2, layout.pos.col = 2))
print(p1, vp = viewport(layout.pos.row = 3, layout.pos.col = 1))
print(p2, vp = viewport(layout.pos.row = 3, layout.pos.col = 2))
[R] How to list all the products' information of the latest month?
Hi, I have a data frame like this:

Product Price Year_Month PE
A 100 201012 -2
A 98 201101 -3
A 97 201102 -2.5
B 110 201101 -1
B 100 201102 -2
B 90 201103 -4

How can I achieve the following result using R:

Product Price Year_Month PE
A 97 201102 -2.5
B 90 201103 -4

in other words, list all the products' information of the latest month? Thanks for your help. Kind regards, Lingyi
[R] reviewer comment
Could someone explain to me this sentence from a reviewer, below in bold underlined:

"Authors should try to be more detailed in the description of analyses: some of the details reported in the Principal components analysis paragraph (Results) should be moved here. Because a highly asymmetric distribution could affect Principal Component Analysis results, symmetry of distribution should be tested. Authors should also indicate if outliers were observed and consequently excluded because they could affect factors."

Any help would be greatly appreciated! Regards ML

-- Mohamed Lajnef, IE INSERM U955 eq 15, Pôle de Psychiatrie, Hôpital CHENEVIER, 40, rue Mesly, 94010 CRETEIL Cedex, FRANCE, mohamed.laj...@inserm.fr, tel: 01 49 81 32 79, Sec: 01 49 81 32 90, fax: 01 49 81 30 99
Re: [R] does R read commands from scripts instantaneously or sequentially during processing
Dear all, thanks, Rolf and Jeff, for your replies. The command below runs under Suse Linux. I guess, however, the phenomena I observed would happen under other operating systems as well. The reason why I asked was that R produced some error messages that did not really point me in the direction of the edited script file. These errors were usually something like:

Error: unexpected symbol in "cess finished."

The line in the script which caused this error is:

print(paste(as.character(Sys.time()), ': Process finished.', sep=''))

This line contains valid R code and would normally not produce an error. Some testing showed that the error above only happens when I edit the code of the script while the script is running. So R probably reads in a script submitted that way sequentially, directly while executing the individual commands. No idea though what happens if I start the script via source inside R itself. Thanks again for your suggestions Jannis

On 14.03.2013 22:47, Rolf Turner wrote: On 03/15/2013 05:13 AM, Jannis wrote: Dear R community, when I source a script into R via:

R --slave < scriptname.R

is the whole script file read at once during startup or is each individual line of code read sequentially during the execution (i.e. directly before R processes the respective command)? In other words, can I safely edit the scriptname.R file even when an active R process still runs the command above?

Experiment. Build a toy script with a loop that never terminates. Set it going. Edit the script and change the code so that the loop terminates. See what happens.
[R] Creating a hyperlink in a csv file
Hi, I was wondering if it is possible to create a hyperlink in a csv file using R code and some package. For example, in the following code:

links <- cbind(rep('Click for Google', 3), "http://www.google.com")
write.table(links, 'test.csv', sep=',', row.names=F, col.names=F)

the web address should be linked to 'Click for Google'. many thanks!
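CSV is plain text and has no notion of hyperlinks, so this cannot be done in the file format itself. One common workaround, not from the thread and resting on the assumption that the file will be opened in Excel (or another spreadsheet that evaluates formulas found in CSV cells), is to write Excel's HYPERLINK() formula into each cell:

```r
# Workaround sketch: store an Excel HYPERLINK formula in the CSV cells.
# This only becomes a clickable link when the CSV is opened in a
# spreadsheet that evaluates formulas; the CSV itself stays plain text.
out  <- file.path(tempdir(), "test.csv")
url  <- "http://www.google.com"
cell <- sprintf('=HYPERLINK("%s", "Click for Google")', url)
links <- cbind(rep(cell, 3))
# write.table quotes each field and doubles the embedded quotes
# (qmethod = "double", the default), which is the escaping Excel expects.
write.table(links, out, sep = ",", row.names = FALSE, col.names = FALSE)
readLines(out)
```

For a file that must carry real link metadata, writing .xlsx directly (e.g. with a spreadsheet-writing package) is the more robust route.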
Re: [R] How to list all the products' information of the latest month?
Try this:

> x <- read.table(text = "Product Price Year_Month PE
+ A 100 201012 -2
+ A 98 201101 -3
+ A 97 201102 -2.5
+ B 110 201101 -1
+ B 100 201102 -2
+ B 90 201103 -4", header = TRUE, as.is = TRUE)
> do.call(rbind
+   , lapply(split(x, x$Product), tail, 1)
+ )
  Product Price Year_Month   PE
A       A    97     201102 -2.5
B       B    90     201103 -4.0

On Fri, Mar 15, 2013 at 5:56 AM, Tammy Ma metal_lical...@live.com wrote: Hi, I have a data frame like this:

Product Price Year_Month PE
A 100 201012 -2
A 98 201101 -3
A 97 201102 -2.5
B 110 201101 -1
B 100 201102 -2
B 90 201103 -4

How can I achieve the following result using R:

Product Price Year_Month PE
A 97 201102 -2.5
B 90 201103 -4

in other words, list all the products' information of the latest month? Thanks for your help. Kind regards, Lingyi

-- Jim Holtman Data Munger Guru What is the problem that you are trying to solve? Tell me what you want to do, not how you want to do it.
[R] Writing a hyperlink to a csv file
Hi, I was wondering if it is possible to create a hyperlink in a csv file using R code and some package. For example, in the following code:

links <- cbind(rep('Click for Google', 3), "google search address goes here") ## R Mailing list blocks if I put the actual web address here
write.table(links, 'test.csv', sep=',', row.names=F, col.names=F)

the web address should be linked to 'Click for Google'. many thanks!
Re: [R] How to list all the products' information of the latest month?
On 15-03-2013, at 10:56, Tammy Ma metal_lical...@live.com wrote: Hi, I have a data frame like this:

Product Price Year_Month PE
A 100 201012 -2
A 98 201101 -3
A 97 201102 -2.5
B 110 201101 -1
B 100 201102 -2
B 90 201103 -4

How can I achieve the following result using R:

Product Price Year_Month PE
A 97 201102 -2.5
B 90 201103 -4

Another option is to use aggregate, like this:

aggregate(x, by=list(x$Product), FUN=function(z) tail(z,1))[,-1]

or

aggregate(. ~ Product, data=x, FUN=function(z) tail(z,1))

Berend
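Since the rows are already ordered by Year_Month within each product, a third base-R option (not from the thread) is duplicated() with fromLast = TRUE, which keeps only the last row of each Product group:

```r
# Rebuild the poster's data frame.
x <- read.table(text = "Product Price Year_Month PE
A 100 201012 -2
A 98 201101 -3
A 97 201102 -2.5
B 110 201101 -1
B 100 201102 -2
B 90 201103 -4", header = TRUE, as.is = TRUE)

# duplicated(..., fromLast = TRUE) flags every row whose Product occurs
# again later; negating it keeps each product's final (latest) row.
# This assumes rows are sorted by Year_Month within Product.
latest <- x[!duplicated(x$Product, fromLast = TRUE), ]
latest
```

If the sort order cannot be guaranteed, order the data first, e.g. x[order(x$Product, x$Year_Month), ], before applying the same trick.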
Re: [R] does R read commands from scripts instantaneously or sequentially during processing
On 15/03/2013 10:40, Jannis wrote: Dear all, thanks, Rolf and Jeff, for your replies. The command below runs under Suse Linux. I guess, however, the phenomena I observed would happen under other operating systems as well. The reason why I asked was that R produced some error messages that did not really point me in the direction of the edited script file. These errors were usually something like:

Error: unexpected symbol in "cess finished."

The line in the script which caused this error is:

print(paste(as.character(Sys.time()), ': Process finished.', sep=''))

This line contains valid R code and would normally not produce an error. Some testing showed that the error above only happens when I edit the code of the script while the script is running. So R probably reads in a script submitted that way sequentially, directly while executing the

If reading from stdin, it does (like any other interpreter): however, stdin is buffered if re-directed, so the input script is read in blocks from a file (the size of the block depending on the OS).

individual commands. No idea though what happens if I start the script via source inside R itself.

R is Open Source, and you can read the code of source(). It really isn't hard to see that it parses the whole file, then executes the parsed expressions one at a time.

Thanks again for your suggestions Jannis

On 14.03.2013 22:47, Rolf Turner wrote: On 03/15/2013 05:13 AM, Jannis wrote: Dear R community, when I source a script into R via:

R --slave < scriptname.R

is the whole script file read at once during startup or is each individual line of code read sequentially during the execution (i.e. directly before R processes the respective command)? In other words, can I safely edit the scriptname.R file even when an active R process still runs the command above?

Experiment. Build a toy script with a loop that never terminates. Set it going. Edit the script and change the code so that the loop terminates. See what happens.
[It seems to me that nothing happens, so that you *can* safely edit the script while the process runs. But further experimentation would be advisable.] cheers, Rolf Turner

-- Brian D. Ripley, rip...@stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self), +44 1865 272866 (PA), 1 South Parks Road, Oxford OX1 3TG, UK, Fax: +44 1865 272595
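The parse-then-execute behaviour of source() described above can be checked directly: if a file contains a syntax error anywhere, nothing in it runs, not even the valid lines before the error. A small self-contained check (the variable name is made up for the demonstration):

```r
# Demonstrate that source() parses the whole file before evaluating anything:
# a syntax error on the last line prevents even the first line from running.
f <- tempfile(fileext = ".R")
writeLines(c("zz_first_line_ran <- TRUE",   # valid assignment
             "this is not valid R ("),      # guaranteed parse error
           f)
res <- tryCatch(source(f), error = function(e) "parse error")
res                        # the parse error was raised
exists("zz_first_line_ran")  # FALSE: the first line was never evaluated
unlink(f)
```

This is the key difference from `R --slave < script.R`, where input is consumed from stdin incrementally and earlier lines may already have executed by the time a later error is reached.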
Re: [R] column and line graphs in R
On 03/15/2013 01:40 AM, Gian Maria Niccolò Benucci wrote: Hi again, Thank you all for your support. I would love to have a graph in which two variables are shown at the same time. For example a histogram and a curve should be the perfect choice. I tried to use twoord.plot() but I am not sure I understand how to manage the arguments lx, ly, rx, ry... Anyway these are my data:

nat_af rel.abund rel.freq
MOTU2 0.003 0.083
MOTU4 0.029 0.167
MOTU6 0.033 0.167
MOTU7 0.023 0.083
MOTU9 0.009 0.083
MOTU11 0.042 0.250
MOTU14 0.069 0.083
MOTU16 0.059 0.167
MOTU17 0.034 0.083
MOTU18 0.049 0.083
MOTU19 0.084 0.333
MOTU20 0.015 0.083
MOTU21 0.059 0.083
MOTU22 0.032 0.167
MOTU23 0.142 0.250
MOTU24 0.031 0.083
MOTU25 0.034 0.083
MOTU29 0.010 0.083
MOTU30 0.011 0.083
MOTU33 0.004 0.083
MOTU36 0.034 0.333
MOTU34 0.182 0.417

First column is the relative abundance of the given MOTU and second column is the relative frequency of the same MOTU.

Hi Gian, You can do this in twoord.plot like this (the data is named nat_af and the first column is labeled "label"):

twoord.plot(1:22-0.2, nat_af$rel.abund, 1:22+0.2, nat_af$rel.freq,
            type=c("bar","bar"), lylim=c(0,0.19), rylim=c(0,0.43), halfwidth=0.2,
            main="Abundance and frequency", ylab="Abundance", rylab="Frequency",
            xticklab=rep("",22))
staxlab(1, at=1:22, labels=nat_af$label, cex=0.8, srt=45)

Jim
[R] Data manipulation
Hello all, I would appreciate your thoughts on a seemingly simple problem. I have a database where each row represents a single record. I want to aggregate this database, so I use the aggregate command:

D <- read.csv("C:\\Users\\test.csv")
attach(D)
by1 <- factor(Class)
by2 <- factor(X)
W <- aggregate(x=Count, by=list(by1, by2), FUN=sum)

The results I get have the form:

W
  Group.1 Group.2 x
1       1     0.1 4
2       2     0.1 7
3       3     0.1 1
4       1     0.2 3
5       3     0.2 4
6       3     0.3 4

However, what I really want is an aggregation which includes the zero values, i.e.:

W
  Group.1 Group.2 x
1       1     0.1 4
2       2     0.1 7
3       3     0.1 1
4       1     0.2 3
5       2     0.2 0
6       3     0.2 4
7       1     0.3 0
8       2     0.3 0
9       3     0.3 4

How can I achieve what I want? Best regards, Ioanna
Re: [R] Data manipulation
What zero values? And are they actually zeros, or are they NA's, that is, missing values? The code looks okay, but without some sample data it is difficult to know exactly what you are doing. The easiest way to supply data is to use the dput() function. Example with your file named testfile:

dput(testfile)

Then copy the output and paste it into your email. For large data sets, you can just supply a representative sample. Usually dput(head(testfile, 100)) will be sufficient.
http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example
Please supply some sample data.

John Kane
Kingston ON Canada

-Original Message- From: ii54...@msn.com Sent: Fri, 15 Mar 2013 12:40:54 + To: r-help@r-project.org Subject: [R] Data manipulation

Hello all, I would appreciate your thoughts on a seemingly simple problem. I have a database where each row represents a single record. I want to aggregate this database, so I use the aggregate command:

D <- read.csv("C:\\Users\\test.csv")
attach(D)
by1 <- factor(Class)
by2 <- factor(X)
W <- aggregate(x=Count, by=list(by1, by2), FUN=sum)

The results I get follow the form:

W
  Group.1 Group.2 x
1       1     0.1 4
2       2     0.1 7
3       3     0.1 1
4       1     0.2 3
5       3     0.2 4
6       3     0.3 4

However, what I really want is an aggregation which includes the zero values, i.e.:

W
  Group.1 Group.2 x
1       1     0.1 4
2       2     0.1 7
3       3     0.1 1
4       1     0.2 3
        2     0.2 0
5       3     0.2 4
        1     0.3 0
        2     0.3 0
6       3     0.3 4

How can I achieve what I want? Best regards, Ioanna
Re: [R] Creating a hyperlink in a csv file
On Fri, Mar 15, 2013 at 10:52 AM, Brian Smith bsmith030...@gmail.com wrote:
Hi, I was wondering if it is possible to create a hyperlink in a csv file using R code and some package. For example, in the following code:

A csv file is a plain text file and by definition doesn't have hyperlinks. If you want a hyperlink, you'll need to export to a different format or use a reader which will interpret a URL as a hyperlink automatically.

MW
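[One hedged workaround, not part of MW's answer: if the CSV will be opened in a spreadsheet program such as Excel or LibreOffice, you can write an =HYPERLINK() formula into the cell; the spreadsheet, not the CSV format itself, renders it as a clickable link. The URL below is a placeholder.]

```r
# Sketch: embed a spreadsheet HYPERLINK formula in a CSV cell.
# Whether it is clickable depends entirely on the program that opens the file.
link <- '=HYPERLINK("http://www.example.com","Click for Google")'
write.csv(data.frame(link = rep(link, 3)), "test.csv",
          row.names = FALSE)  # write.csv doubles embedded quotes, which spreadsheets expect
```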
Re: [R] How to list the all products' information of the latest month?
dat1 <- read.table(text="
Product Price Year_Month PE
A 100 201012 -2
A 98 201101 -3
A 97 201102 -2.5
B 110 201101 -1
B 100 201102 -2
B 90 201103 -4
", sep="", header=TRUE, stringsAsFactors=FALSE)

dat1[as.logical(with(dat1, ave(Year_Month, Product, FUN=function(x) x==max(x)))), ]
#  Product Price Year_Month   PE
#3       A    97     201102 -2.5
#6       B    90     201103 -4.0

A.K.

- Original Message - From: Tammy Ma metal_lical...@live.com To: r-help@r-project.org Sent: Friday, March 15, 2013 5:56 AM Subject: [R] How to list the all products' information of the latest month?

Hi, I have a data frame like this:

Product Price Year_Month PE
A 100 201012 -2
A 98 201101 -3
A 97 201102 -2.5
B 110 201101 -1
B 100 201102 -2
B 90 201103 -4

How can I achieve the following result using R:

Product Price Year_Month PE
A 97 201102 -2.5
B 90 201103 -4

in other words, list all the products' information for the latest month? Thanks for your help.

Kind regards,
Lingyi
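[An alternative sketch of the same idea, mine rather than from the thread: compute each product's latest month with aggregate(), then merge() back to recover the full rows.]

```r
# Sketch: latest-month rows per product via aggregate() + merge()
latest <- aggregate(Year_Month ~ Product, data = dat1, FUN = max)
merge(latest, dat1)   # keeps only rows matching each product's maximum Year_Month
```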
Re: [R] reviewer comment
No idea what sentence. R-help strips any html and only provides a text message, so all formatting has been lost. I think the question is not really an R-help question, but if you resubmit the post you need to show the sentence in question in another way.

John Kane Kingston ON Canada

-Original Message- From: mohamed.laj...@inserm.fr Sent: Fri, 15 Mar 2013 11:26:45 +0100 To: r-help@r-project.org Subject: [R] reviewer comment

Could someone explain to me this reviewer's sentence below, in bold and underlined:

Authors should try to be more detailed in the description of analyses: some of the details reported in the Principal components analysis paragraph (Results) should be moved here. Because a highly asymmetric distribution could affect Principal Component Analysis results, symmetry of distribution should be tested. Authors should also indicate if outliers were observed and consequently excluded because they could affect factors

Any help would be greatly appreciated! Regards ML

-- Mohamed Lajnef, IE INSERM U955 eq 15 # Pôle de Psychiatrie # Hôpital CHENEVIER # 40, rue Mesly # 94010 CRETEIL Cedex FRANCE # mohamed.laj...@inserm.fr # tel : 01 49 81 32 79 # Sec : 01 49 81 32 90 # fax : 01 49 81 30 99 #
[R] Looking for a good tutorial on ff package
Hi, I am looking for a good tutorial on the ff package. Any suggestions? Also, any other package anyone would recommend for dealing with data that extends beyond RAM would be greatly appreciated. Thanks, Fritz Zuhl
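[As a starting point, my own sketch with an invented file and column name; check the ff package documentation for current argument names. Chunked CSV import into an on-disk ffdf looks roughly like this.]

```r
# Sketch: read a large CSV into an on-disk ffdf object, chunk by chunk
library(ff)
bigdat <- read.csv.ffdf(file = "big.csv", header = TRUE,
                        next.rows = 100000)   # rows read per chunk
dim(bigdat)
head(bigdat$somecol[])   # [] materialises a column into RAM as an ordinary vector
```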
Re: [R] Writing a hyperlink to a csv file
Well, you can write it there, but it won't do anything until read into some software that can interpret it as a url. A csv file is just plain text.

John Kane Kingston ON Canada

-Original Message- From: bsmith030...@gmail.com Sent: Fri, 15 Mar 2013 07:53:02 -0400 To: r-help@r-project.org Subject: [R] Writing a hyperlink to a csv file

Hi, I was wondering if it is possible to create a hyperlink in a csv file using R code and some package. For example, in the following code:

links <- cbind(rep('Click for Google', 3), "google search address goes here") ## R Mailing list blocks if I put the actual web address here
write.table(links, 'test.csv', sep=',', row.names=F, col.names=F)

the web address should be linked to 'Click for Google'. many thanks!
[R] metafor - multivariate analysis
Dear Metafor users, I'm conducting a meta-analysis of prevalence of a particular behaviour based on someone else's code. I've been labouring under the impression that this:

summary(rma.1 <- rma(yi, vi, mods=cbind(approxmeanage, interviewmethodcode),
                     data=mal, method="DL", knha=FALSE, weighted=FALSE, intercept=TRUE))

is doing the multivariate analysis that I want, but I have read that multivariate analysis can't be done in metafor. This is the output:

Mixed-Effects Model (k = 22; tau^2 estimator: DL)

 logLik Deviance      AIC      BIC
18.7726 -37.5452 -27.5452 -22.0899

tau^2 (estimate of residual amount of heterogeneity): 0.0106
tau (sqrt of the estimate of residual heterogeneity): 0.1031

Test for Residual Heterogeneity: QE(df = 18) = 1273.9411, p-val < .0001
Test of Moderators (coefficient(s) 2,3,4): QM(df = 3) = 11.0096, p-val = 0.0117

Model Results:
                     estimate      se    zval   pval    ci.lb    ci.ub
intrcpt                0.4014  0.1705  2.3537 0.0186   0.0671   0.7356  *
continent             -0.0206  0.0184 -1.1200 0.2627  -0.0568   0.0155
approxmeanage          0.0076  0.0091  0.8354 0.4035  -0.0102   0.0254
interviewmethodcode   -0.0892  0.0273 -3.2702 0.0011  -0.1426  -0.0357  **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

My questions are: 1. what is this line of code doing? 2. if it isn't multivariate analysis, will I have to use mvmeta instead?

thanks very much for any help
Branwen

http://r.789695.n4.nabble.com/metafor-multivariate-analysis-td4661233.html
Re: [R] Data manipulation
Is this what you want to do?

D2 <- expand.grid(Class=unique(D$Class), X=unique(D$X))
D2 <- merge(D2, D, all=TRUE)
D2$Count[is.na(D2$Count)] <- 0
W <- aggregate(D2$Count, list(D2$Class, D2$X), sum)
W

Best, Nello

-Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of IOANNA Sent: Freitag, 15. März 2013 13:41 To: r-help@r-project.org Subject: [R] Data manipulation

Hello all, I would appreciate your thoughts on a seemingly simple problem. I have a database where each row represents a single record. I want to aggregate this database, so I use the aggregate command:

D <- read.csv("C:\\Users\\test.csv")
attach(D)
by1 <- factor(Class)
by2 <- factor(X)
W <- aggregate(x=Count, by=list(by1, by2), FUN=sum)

The results I get follow the form:

W
  Group.1 Group.2 x
1       1     0.1 4
2       2     0.1 7
3       3     0.1 1
4       1     0.2 3
5       3     0.2 4
6       3     0.3 4

However, what I really want is an aggregation which includes the zero values, i.e.:

W
  Group.1 Group.2 x
1       1     0.1 4
2       2     0.1 7
3       3     0.1 1
4       1     0.2 3
        2     0.2 0
5       3     0.2 4
        1     0.3 0
        2     0.3 0
6       3     0.3 4

How can I achieve what I want? Best regards, Ioanna
[R] Poisson and negbin gamm in mgcv - overdispersion and theta
Dear R users, I am trying to use gamm from package mgcv to model results from a mesocosm experiment. My model is of the type

M1 <- gamm(Resp ~ s(Day, k=8) + s(Day, by=C, k=8) + Flow + offset(LogVol),
           data=MyResp, correlation=corAR1(form= ~ Day|Mesocosm),
           family=poisson(link=log))

where the response variable is counts, offset by the log of sample volume. Unfortunately, the residuals from the model show heteroscedasticity. While trying to follow up on this, I have run into the following problems:

1) How to estimate the overdispersion parameter from a (Poisson) gamm? I have not been able to extract residual degrees of freedom from M1.

2) How to manually estimate theta for a negative binomial gamm? I would like to see if applying a negative binomial distribution with log link (model below) would solve the problem. However, negbin in gamm requires a known theta...

M2 <- gamm(Resp ~ s(Day, k=8) + s(Day, by=C, k=8) + Flow + offset(LogVol),
           data=MyResp, correlation=corAR1(form= ~ Day|Mesocosm),
           family=negbin(THETA, link=log))

3) And finally, can I somehow compare the models M1 and M2? Trying anova(M1, M2) gives the message: Error in eval(expr, envir, enclos) : object 'fixed' not found (and I am anyway not sure if this is a valid approach between Poisson and negbin gamms).

I am most grateful for any help! Aino

Aino Hosia, Postdoc, Havforskningsinstituttet/Institute of Marine Research, PO Box 1870 Nordnes, N-5817 Bergen, Norway (Nordnesgaten 50). Tel: +47 55 23 53 49. E-mail: aino.ho...@imr.no www.imr.no
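[For point 1, one common back-of-envelope sketch, my own rather than a reply from the list, uses Pearson residuals from the $gam component; the residual degrees of freedom computed this way are only an approximation for gamm fits.]

```r
# Rough overdispersion check for a Poisson gamm (illustrative sketch only)
g <- M1$gam
pearson <- residuals(g, type = "pearson")
df.resid <- length(pearson) - sum(g$edf)   # approximate residual df
phi <- sum(pearson^2) / df.resid
phi   # values well above 1 suggest overdispersion
```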
Re: [R] Data manipulation
Hi IOANNA, I got the data, but it is missing a value in Count (length 22 vs length 23 in the other two variables), so I stuck in an extra 1. I hope this is correct. There also was an attachment called winmail.dat that appears to be some kind of Microsoft Mail note that is pure gibberish to me--I'm on a Linux box.

For some reason, in neither posting does your example of the output you want come through. Are you posting in html? R-help strips any html, so is there a chance it stripped out a table?

If I do this:

table(Class, X)
     X
Class 0.1 0.2 0.3
    1   4   3   0
    2   7   0   0
    3   1   4   4

I see that you have two combinations of Class and X with no entries. Is this what you wanted to show in W? If so, it is not immediately apparent how to go about this.

John Kane Kingston ON Canada

-Original Message- From: ii54...@msn.com Sent: Fri, 15 Mar 2013 13:11:48 + To: jrkrid...@inbox.com, r-help@r-project.org Subject: RE: [R] Data manipulation

Hello John, I thought I attached the file. So here we go:

Class = c(1,1,1,1,1,1,1,2,2,2,2,2,2,2,3,3,3,3,3,3,3,3,3)
X = c(0.1,0.1,0.1,0.1,0.2,0.2,0.2,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.2,0.2,0.2,0.2,0.3,0.3,0.3,0.3)
Count = c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1)
by1 <- factor(Class)
by2 <- factor(X)
W <- aggregate(x=Count, by=list(by1, by2), FUN=sum)

However, what I want is a table that also includes lines for the Group.1 and Group.2 values for which there are no records. In other words something like this: Thanks again. I hope it's clearer now. Ioanna

-Original Message- From: John Kane [mailto:jrkrid...@inbox.com] Sent: 15 March 2013 12:51 To: IOANNA; r-help@r-project.org Subject: RE: [R] Data manipulation

What zero values? And are they actually zeros, or are they NA's, that is, missing values? The code looks okay, but without some sample data it is difficult to know exactly what you are doing. The easiest way to supply data is to use the dput() function. Example with your file named testfile: dput(testfile). Then copy the output and paste it into your email.
For large data sets, you can just supply a representative sample. Usually dput(head(testfile, 100)) will be sufficient.
http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example
Please supply some sample data.

John Kane Kingston ON Canada

-Original Message- From: ii54...@msn.com Sent: Fri, 15 Mar 2013 12:40:54 + To: r-help@r-project.org Subject: [R] Data manipulation

Hello all, I would appreciate your thoughts on a seemingly simple problem. I have a database where each row represents a single record. I want to aggregate this database, so I use the aggregate command:

D <- read.csv("C:\\Users\\test.csv")
attach(D)
by1 <- factor(Class)
by2 <- factor(X)
W <- aggregate(x=Count, by=list(by1, by2), FUN=sum)

The results I get follow the form:

W
  Group.1 Group.2 x
1       1     0.1 4
2       2     0.1 7
3       3     0.1 1
4       1     0.2 3
5       3     0.2 4
6       3     0.3 4

However, what I really want is an aggregation which includes the zero values, i.e.:

W
  Group.1 Group.2 x
1       1     0.1 4
2       2     0.1 7
3       3     0.1 1
4       1     0.2 3
        2     0.2 0
5       3     0.2 4
        1     0.3 0
        2     0.3 0
6       3     0.3 4

How can I achieve what I want? Best regards, Ioanna
Re: [R] reviewer comment
Thanks John for your reply.

The reviewer comment: "asymmetric distribution could affect Principal Component Analysis results, symmetry of distribution should be tested. Authors should also indicate if outliers were observed and consequently excluded because they could affect factors"

My question: what does it mean that an asymmetric distribution could affect PCA? And also that outliers could affect factors? Sorry for this not-quite-R-help question.

Best regards M

Le 15/03/13 14:05, John Kane a écrit : No idea what sentence. R-help strips any html and only provides a text message, so all formatting has been lost. I think the question is not really an R-help question, but if you resubmit the post you need to show the sentence in question in another way. John Kane Kingston ON Canada

-Original Message- From: mohamed.laj...@inserm.fr Sent: Fri, 15 Mar 2013 11:26:45 +0100 To: r-help@r-project.org Subject: [R] reviewer comment

Could someone explain to me this reviewer's sentence below, in bold and underlined: Authors should try to be more detailed in the description of analyses: some of the details reported in the Principal components analysis paragraph (Results) should be moved here. Because a highly asymmetric distribution could affect Principal Component Analysis results, symmetry of distribution should be tested. Authors should also indicate if outliers were observed and consequently excluded because they could affect factors

Any help would be greatly appreciated! Regards ML
-- Mohamed Lajnef, IE INSERM U955 eq 15 # Pôle de Psychiatrie # Hôpital CHENEVIER # 40, rue Mesly # 94010 CRETEIL Cedex FRANCE # mohamed.laj...@inserm.fr # tel : 01 49 81 32 79 # Sec : 01 49 81 32 90 # fax : 01 49 81 30 99 #
Re: [R] Data manipulation
Nice. That does look like it. IOANNA?

John Kane Kingston ON Canada

-Original Message- From: nbla...@ispm.unibe.ch Sent: Fri, 15 Mar 2013 14:27:03 +0100 To: ii54...@msn.com, r-help@r-project.org Subject: Re: [R] Data manipulation

Is this what you want to do?

D2 <- expand.grid(Class=unique(D$Class), X=unique(D$X))
D2 <- merge(D2, D, all=TRUE)
D2$Count[is.na(D2$Count)] <- 0
W <- aggregate(D2$Count, list(D2$Class, D2$X), sum)
W

Best, Nello

-Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of IOANNA Sent: Freitag, 15. März 2013 13:41 To: r-help@r-project.org Subject: [R] Data manipulation

Hello all, I would appreciate your thoughts on a seemingly simple problem. I have a database where each row represents a single record. I want to aggregate this database, so I use the aggregate command:

D <- read.csv("C:\\Users\\test.csv")
attach(D)
by1 <- factor(Class)
by2 <- factor(X)
W <- aggregate(x=Count, by=list(by1, by2), FUN=sum)

The results I get follow the form:

W
  Group.1 Group.2 x
1       1     0.1 4
2       2     0.1 7
3       3     0.1 1
4       1     0.2 3
5       3     0.2 4
6       3     0.3 4

However, what I really want is an aggregation which includes the zero values, i.e.:

W
  Group.1 Group.2 x
1       1     0.1 4
2       2     0.1 7
3       3     0.1 1
4       1     0.2 3
        2     0.2 0
5       3     0.2 4
        1     0.3 0
        2     0.3 0
6       3     0.3 4

How can I achieve what I want? Best regards, Ioanna
Re: [R] question about nls
Actually, it likely won't matter where you start. The Gauss-Newton direction is nearly always close to 90 degrees from the gradient, as seen by turning on trace=TRUE in the package nlmrt function nlxb(), which does a safeguarded Marquardt calculation. This can be used in place of nls(), except you need to put your data in a data frame. It finds a solution pretty straightforwardly, though with quite a few iterations and function evaluations. Of course, one may not really want to do any statistics with 4 observations and 3 parameters, but the problem illustrates the GN vs. Marquardt directions. JN

sol <- nlxb(y ~ exp(a + b*x) + d, start=list(a=0, b=0, d=1), data=mydata, trace=TRUE)

formula: y ~ exp(a + b * x) + d
lower: [1] -Inf -Inf -Inf
upper: [1] Inf Inf Inf
...snip...
Data variable y : [1] 0.8 6.5 20.5 45.9
Data variable x : [1] 60 80 100 120
Start: lamda: 1e-04 SS= 2291.15 at a = 0 b = 0 d = 1 1 / 0
gradient projection = -2191.093 g-delta-angle= 90.47372 Stepsize= 1
lamda: 0.001 SS= 4.408283e+55 at a = -25.29517 b = 0.74465 d = -24.29517 2 / 1
gradient projection = -2168.709 g-delta-angle= 90.48307 Stepsize= 1
lamda: 0.01 SS= 3.986892e+54 at a = -24.55223 b = 0.7284461 d = -23.55223 3 / 1
gradient projection = -1991.804 g-delta-angle= 90.58199 Stepsize= 1
lamda: 0.1 SS= 2.439544e+46 at a = -18.71606 b = 0.6010118 d = -17.71606 4 / 1
gradient projection = -1476.935 g-delta-angle= 92.79733 Stepsize= 1
lamda: 1 SS= 4.114152e+23 at a = -2.883776 b = 0.2505892 d = -1.883776 5 / 1
gradient projection = -954.5234 g-delta-angle= 91.78881 Stepsize= 1
lamda: 10 SS= 39033042903 at a = 2.918809 b = 0.07709855 d = 3.918809 6 / 1
gradient projection = -264.9953 g-delta-angle= 91.41647 Stepsize= 1
lamda: 4 SS= 571.451 at a = 1.023367 b = 0.01762421 d = 2.023367 7 / 1
gradient projection = -60.46016 g-delta-angle= 90.96421 Stepsize= 1
lamda: 1.6 SS= 462.3257 at a = 1.080764 b = 0.0184132 d = 1.981399 8 / 2
gradient projection = -56.91866 g-delta-angle= 90.08103 Stepsize= 1
lamda: 0.64 SS= 359.6233 at a = 1.135265 b = 0.01942354 d = 0.9995471 9 / 3
gradient projection = -65.90027 g-delta-angle= 90.04527 Stepsize= 1
... snip ...
lamda: 0.2748779 SS= 0.5742948 at a = -0.1491842 b = 0.03419761 d = -6.196575 31 / 20
gradient projection = -6.998402e-25 g-delta-angle= 90.07554 Stepsize= 1
lamda: 2.748779 SS= 0.5742948 at a = -0.1491842 b = 0.03419761 d = -6.196575 32 / 20
gradient projection = -2.76834e-25 g-delta-angle= 90.16973 Stepsize= 1
lamda: 27.48779 SS= 0.5742948 at a = -0.1491842 b = 0.03419761 d = -6.196575 33 / 20
gradient projection = -4.632864e-26 g-delta-angle= 90.08759 Stepsize= 1
No parameter change

On 13-03-15 07:00 AM, r-help-requ...@r-project.org wrote:
Message: 36 Date: Thu, 14 Mar 2013 11:04:27 -0400 From: Gabor Grothendieck ggrothendi...@gmail.com To: menglaomen...@163.com Cc: R help r-help@r-project.org Subject: Re: [R] question about nls

On Thu, Mar 14, 2013 at 5:07 AM, menglaomen...@163.com wrote:
Hi, all: I met a problem with nls. My data:

x y
60 0.8
80 6.5
100 20.5
120 45.9

I want to fit an exp curve to the data. My code:

nls(y ~ exp(a + b*x) + d, start=list(a=0, b=0, d=1))
Error in nlsModel(formula, mf, start, wts) : singular gradient matrix at initial parameter estimates

I can't find out the reason for the error. Any suggestions are welcome.

The gradient is singular at your starting value, so you will have to use a better starting value. If d = 0 then it is linear in log(y), so you can compute a starting value using lm like this:

lm1 <- lm(log(y) ~ x, DF)
st <- setNames(c(coef(lm1), 0), c("a", "b", "d"))

Also note that you are trying to fit a model with 3 parameters to only 4 data points.

-- Statistics Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com
Re: [R] reviewer comment
I think this is more a question for something like Cross Validated, but you may well get a hint or two here. Unfortunately, while I vaguely see what the reviewer is getting at, I certainly don't know enough to help.

John Kane Kingston ON Canada

-Original Message- From: mohamed.laj...@inserm.fr Sent: Fri, 15 Mar 2013 14:38:10 +0100 To: jrkrid...@inbox.com Subject: Re: [R] reviewer comment

Thanks John for your reply. The reviewer comment: "asymmetric distribution could affect Principal Component Analysis results, symmetry of distribution should be tested. Authors should also indicate if outliers were observed and consequently excluded because they could affect factors"

My question: what does it mean that an asymmetric distribution could affect PCA? And also that outliers could affect factors? Sorry for this not-quite-R-help question. Best regards M

Le 15/03/13 14:05, John Kane a écrit : No idea what sentence. R-help strips any html and only provides a text message, so all formatting has been lost. I think the question is not really an R-help question, but if you resubmit the post you need to show the sentence in question in another way. John Kane Kingston ON Canada

-Original Message- From: mohamed.laj...@inserm.fr Sent: Fri, 15 Mar 2013 11:26:45 +0100 To: r-help@r-project.org Subject: [R] reviewer comment

Could someone explain to me this reviewer's sentence below, in bold and underlined: Authors should try to be more detailed in the description of analyses: some of the details reported in the Principal components analysis paragraph (Results) should be moved here. Because a highly asymmetric distribution could affect Principal Component Analysis results, symmetry of distribution should be tested. Authors should also indicate if outliers were observed and consequently excluded because they could affect factors

Any help would be greatly appreciated! Regards ML

-- Mohamed Lajnef, IE INSERM U955 eq 15 # Pôle de Psychiatrie # Hôpital CHENEVIER # 40, rue Mesly # 94010 CRETEIL Cedex FRANCE # mohamed.laj...@inserm.fr # tel : 01 49 81 32 79 # Sec : 01 49 81 32 90 # fax : 01 49 81 30 99 #
Re: [R] Data manipulation
Thanks a lot!

-Original Message- From: John Kane [mailto:jrkrid...@inbox.com] Sent: 15 March 2013 13:41 To: Blaser Nello; IOANNA; r-help@r-project.org Subject: Re: [R] Data manipulation

Nice. That does look like it. IOANNA?

John Kane Kingston ON Canada

-Original Message- From: nbla...@ispm.unibe.ch Sent: Fri, 15 Mar 2013 14:27:03 +0100 To: ii54...@msn.com, r-help@r-project.org Subject: Re: [R] Data manipulation

Is this what you want to do?

D2 <- expand.grid(Class=unique(D$Class), X=unique(D$X))
D2 <- merge(D2, D, all=TRUE)
D2$Count[is.na(D2$Count)] <- 0
W <- aggregate(D2$Count, list(D2$Class, D2$X), sum)
W

Best, Nello

-Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of IOANNA Sent: Freitag, 15. März 2013 13:41 To: r-help@r-project.org Subject: [R] Data manipulation

Hello all, I would appreciate your thoughts on a seemingly simple problem. I have a database where each row represents a single record. I want to aggregate this database, so I use the aggregate command:

D <- read.csv("C:\\Users\\test.csv")
attach(D)
by1 <- factor(Class)
by2 <- factor(X)
W <- aggregate(x=Count, by=list(by1, by2), FUN=sum)

The results I get follow the form:

W
  Group.1 Group.2 x
1       1     0.1 4
2       2     0.1 7
3       3     0.1 1
4       1     0.2 3
5       3     0.2 4
6       3     0.3 4

However, what I really want is an aggregation which includes the zero values, i.e.:

W
  Group.1 Group.2 x
1       1     0.1 4
2       2     0.1 7
3       3     0.1 1
4       1     0.2 3
        2     0.2 0
5       3     0.2 4
        1     0.3 0
        2     0.3 0
6       3     0.3 4

How can I achieve what I want? Best regards, Ioanna
[R] How to make the labels of pie chart are not overlapping?
I have the following data frame:

Product predicted_MarketShare Predicted_MS_Percentage
A       2.827450e-02          2.8
B       4.716403e-06          0.0
C       1.741686e-01          17.4
D       1.716303e-04          0.0
...

Because there are so many products, and most of the predicted market share is around 0%, when I make a pie chart the labels of those products with 0% market share overlap. How do I keep the labels from overlapping?

Kind regards. Tammy
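[No reply appears in this digest. One common workaround, sketched here with made-up numbers mirroring the table above, is to lump the near-zero slices into a single "Other" category before calling pie().]

```r
# Sketch: merge tiny slices into "Other" so pie() labels do not overlap
shares  <- c(A = 2.8, B = 0.0, C = 17.4, D = 0.0)   # percentages, illustrative
small   <- shares < 1                               # threshold is arbitrary
grouped <- c(shares[!small], Other = sum(shares[small]))
pie(grouped, labels = sprintf("%s (%.1f%%)", names(grouped), grouped))
```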
Re: [R] confidence interval for survfit
The first thing you are missing is the documentation -- try ?survfit.object.

fit <- survfit(Surv(time, status) ~ 1, data)

fit$std.err will contain the standard error of the cumulative hazard, or -log(survival).

The standard error of the survival curve is approximately S(t) * std(hazard), by the delta method. This is what is printed by the summary function, because it is what users expect, but it has very poor performance for computing confidence intervals. A much better one is exp(-1 * confidence interval for the cumulative hazard), which is the default. In fact there are lots of better ones whose relative ranking depends on the details of your simulation study. About the only really consistent result is that anything thoughtful beats S(t) +- 1.96 se(S), easily. The default in R is the one that was best in the most recent paper I had read at the time I set the default. If I were to rank them today using an average over all the comparison papers it would be second or third, but the good methods are so close that in practical terms it hardly matters.

Terry Therneau

On 03/15/2013 06:00 AM, r-help-requ...@r-project.org wrote:
Hi, I am wondering how the confidence interval for the Kaplan-Meier estimator is calculated by survfit(). For example:

summary(survfit(Surv(time, status) ~ 1, data), times=10)
Call: survfit(formula = Surv(rtime10, rstat10) ~ 1, data = mgi)
 time n.risk n.event survival std.err lower 95% CI upper 95% CI
   10    168      55    0.761  0.0282        0.707        0.818

I am trying to reproduce the upper and lower CI by using the standard error. As far as I understand, the default method for survfit() to calculate the confidence interval is on the log survival scale, so:

upper CI = exp(log(0.761) + qnorm(0.975)*0.0282) = 0.804
lower CI = exp(log(0.761) - qnorm(0.975)*0.0282) = 0.720

They are not the same as the output from survfit(). Am I missing something?
Thanks, John
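Terry's point can be checked numerically: the printed std.err is se(S) = S(t) * se(H), so the log-scale interval needs se(log S) = std.err / S(t), not the printed std.err itself. A sketch using the lung data shipped with the survival package (the time point 300 is an arbitrary choice for illustration):

```r
library(survival)

fit <- survfit(Surv(time, status) ~ 1, data = lung)  # default conf.type = "log"
s <- summary(fit, times = 300)

# summary() reports se(S) = S(t) * se(H); divide by S(t) to recover se(log S)
se_logS <- s$std.err / s$surv
lower <- exp(log(s$surv) - qnorm(0.975) * se_logS)
upper <- exp(log(s$surv) + qnorm(0.975) * se_logS)

# these now match the lower/upper 95% CI that survfit itself reports
c(lower, s$lower, upper, s$upper)
```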
Re: [R] question about nls
On Fri, Mar 15, 2013 at 9:45 AM, Prof J C Nash (U30A) nas...@uottawa.ca wrote:
Actually, it likely won't matter where you start. The Gauss-Newton direction is nearly always close to 90 degrees from the gradient, as seen by turning on trace=TRUE in the nlmrt package's function nlxb(), which does a safeguarded Marquardt calculation. This can be used in place of nls(), except you need to put your data in a data frame. It finds a solution pretty straightforwardly, though with quite a few iterations and function evaluations.

Interesting observation, but it does converge in 5 iterations with the improved starting value, whereas it fails due to a singular gradient with the original starting value.

Lines <- "x   y
60  0.8
80  6.5
100 20.5
120 45.9"
DF <- read.table(text = Lines, header = TRUE)

# original starting value - singular gradient
nls(y ~ exp(a + b*x) + d, DF, start = list(a = 0, b = 0, d = 1))
## Error in nlsModel(formula, mf, start, wts) :
##   singular gradient matrix at initial parameter estimates

# better starting value - converges in 5 iterations
lm1 <- lm(log(y) ~ x, DF)
st <- setNames(c(coef(lm1), 0), c("a", "b", "d"))
nls(y ~ exp(a + b*x) + d, DF, start = st)
## Nonlinear regression model
##   model: y ~ exp(a + b * x) + d
##    data: DF
##       a       b       d
## -0.1492  0.0342 -6.1966
##  residual sum-of-squares: 0.5743
## Number of iterations to convergence: 5
## Achieved convergence tolerance: 6.458e-07
Re: [R] Grep with wildcards across multiple columns
I think the way I set up my sample data without any explanation confused things slightly. These data might make things clearer:

# Create fake data
df <- data.frame(code = c(rep(1001, 8), rep(1002, 8)),
                 year = rep(c(rep(2011, 4), rep(2012, 4)), 2),
                 fund = rep(c("10E", "27E", "27E", "29E"), 4),
                 func = rep(c("11", "122000", "214000", "158000"), 4),
                 obj  = rep(c("100", "100", "210", "220"), 4),
                 amount = round(rnorm(16, 5, 1)))

These are financial data with a hierarchical account structure where a zero represents a summary account that rolls up all the accounts at subsequent digits (e.g. 10 rolls up 11, 122000, 158000, etc.). I was trying to do two things with the search parameters: turn zeroes into question marks, and duplicate the functionality of a SQL query using those question marks as wildcards:

# Set parameters
par.fund <- "20E"; par.func <- "10"; par.obj <- "000"
par.fund <- glob2rx(gsub("0", "?", par.fund))
par.func <- glob2rx(gsub("0", "?", par.func))
par.obj  <- glob2rx(gsub("0", "?", par.obj))

Fortunately, Bill's suggestion to use the intersect function worked just fine--since intersect accepts only two arguments, I had to nest a pair of statements:

# Solution: use a pair of nested intersects
dt2 <- dt[intersect(intersect(grep(par.fund, fund), grep(par.func, func)),
                    grep(par.obj, obj)),
          sum(amount), by = c('code', 'year')]
df2 <- ddply(df[intersect(intersect(grep(par.fund, df$fund),
                                    grep(par.func, df$func)),
                          grep(par.obj, df$obj)), ],
             .(code, year), summarize, amount = sum(amount))

Thanks for your ideas!

DB

Daniel Bush | School Finance Consultant
School Financial Services | Wis. Dept. of Public Instruction
daniel.bush -at- dpi.wi.gov | 608-267-9212

-----Original Message-----
From: William Dunlap [mailto:wdun...@tibco.com]
Sent: Thursday, March 14, 2013 5:49 PM
To: Bush, Daniel P. DPI; 'r-help@r-project.org'
Subject: RE: Grep with wildcards across multiple columns

grep(pattern, textVector) returns the integer indices of the elements of textVector that match the pattern.
E.g.,

> grep("T", c("One", "Two", "Three", "Four"))
[1] 2 3

The '&' operator naturally operates on logical vectors of the same length. (If you give it numbers it silently converts 0 to FALSE and other numbers to TRUE.) The two don't fit together. You could use grepl(), which returns a logical vector the length of textVector, as in

grepl(p1, v1) & grepl(p2, v2)

to figure out which entries in the table have v1 matching p1 and v2 matching p2. Or, you could use

intersect(grep(p1, v1), grep(p2, v2))

if you want to stick with integer indices.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Bush, Daniel P. DPI
Sent: Thursday, March 14, 2013 2:43 PM
To: 'r-help@r-project.org'
Subject: [R] Grep with wildcards across multiple columns

I have a fairly large data set with six variables set up like the following dummy:

# Create fake data
df <- data.frame(code = c(rep(1001, 8), rep(1002, 8)),
                 year = rep(c(rep(2011, 4), rep(2012, 4)), 2),
                 fund = rep(c("10E", "10E", "10E", "27E"), 4),
                 func = rep(c("11", "122000", "214000", "158000"), 4),
                 obj  = rep("100", 16),
                 amount = round(rnorm(16, 5, 1)))

What I would like to do is sum the amount variable by code and year, filtering rows using different wildcard searches in each of three columns: 1?E in fund, 1?? in func, and ??? in obj. I'm OK turning these into regular expressions:

# Set parameters
par.fund <- "10E"; par.func <- "10"; par.obj <- "000"
par.fund <- glob2rx(gsub("0", "?", par.fund))
par.func <- glob2rx(gsub("0", "?", par.func))
par.obj  <- glob2rx(gsub("0", "?", par.obj))

The problem occurs when I try to apply multiple greps across columns.
I'd prefer to use data.table since it's so much faster than plyr and I have 159 different sets of parameters to run through, but I get the same error setting it up either way:

# Doesn't work
library(data.table)
dt <- data.table(df)
eval(parse(text = paste(
  "dt2 <- dt[grep('", par.fund, "', fund) & grep('", par.func, "', func) & grep('", par.obj, "', obj), sum(amount), by = c('code', 'year')]",
  sep = "")))
# Warning message:
# In grep("^1.E$", fund) & grep("^1.$", func) :
#   longer object length is not a multiple of shorter object length

# Also doesn't work
library(plyr)
eval(parse(text = paste(
  "df2 <- ddply(df[grep('", par.fund, "', df$fund) & grep('", par.func, "', df$func) & grep('", par.obj, "', df$obj), ], .(code, year), summarize, amount = sum(amount))",
  sep = "")))
# Warning message:
# In grep("^1.E$", df$fund) & grep("^1.$", df$func) :
#   longer object length is not a multiple of shorter object length

Clearly, the
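To make Bill's grepl() suggestion concrete against the fake data above, a base-R sketch (the patterns are written out rather than built with glob2rx, and aggregate() stands in for the data.table/plyr summaries):

```r
# Same fake data as in the original post
df <- data.frame(code = c(rep(1001, 8), rep(1002, 8)),
                 year = rep(c(rep(2011, 4), rep(2012, 4)), 2),
                 fund = rep(c("10E", "10E", "10E", "27E"), 4),
                 func = rep(c("11", "122000", "214000", "158000"), 4),
                 obj  = rep("100", 16),
                 amount = round(rnorm(16, 5, 1)),
                 stringsAsFactors = FALSE)

# grepl() returns full-length logical vectors, so '&' lines the tests up row by row
keep <- grepl("^1.E$", df$fund) & grepl("^1.$", df$func) & grepl("^...$", df$obj)
agg <- aggregate(amount ~ code + year, data = df[keep, ], FUN = sum)
```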
Re: [R] Reassign values based on multiple conditions
I don't see how the data in the three-column table you present is enough to produce the four-column test. Should the first table actually show repeated collar usage, so that you can use the next incidence of the collar as the end date, e.g.

1 01/01/2013
1 02/04/2013

and so on? Some actual data might be useful. The easiest way to supply data is to use the dput() function. Example with your file named testfile:

dput(testfile)

Then copy the output and paste it into your email. For large data sets, you can just supply a representative sample. Usually, dput(head(testfile, 100)) will be sufficient.

Sorry I'm not more helpful.

John Kane
Kingston ON Canada

-----Original Message-----
From: cat.e.co...@gmail.com
Sent: Fri, 15 Mar 2013 12:46:13 +0800
To: r-help@r-project.org
Subject: [R] Reassign values based on multiple conditions

Hi all, I have a simple data frame of three columns - one of numbers (really a categorical variable), one of dates and one of data. Imagine:

collar date       data
1      01/01/2013 x
2      02/01/2013 y
3      04/01/2013 z
4      04/01/2013 a
5      07/01/2013 b

The 'collar' is a GPS collar that was worn by an animal for a certain amount of time, and may then have been worn by a different animal after the batteries needed to be changed. When an animal was caught and the collar battery needed changing, a whole new collar had to be put on, as these animals (wild boar and red deer!) were not that easy to catch. In order to follow the movements of each animal I now need to create a new column that assigns the 'data' by animal rather than by collar.
I have a table of dates, e.g.

animal collar start_date end_date
1      1      01/01/2013 03/01/2013
1      5      04/01/2013 06/01/2013
1      3      07/01/2013 09/01/2013
2      2      01/01/2013 03/01/2013
2      1      04/01/2013 06/01/2013

I have so far been able to make multi-conditional tests:

animal1test  <- (date >= "01/01/13" & date <= "03/01/13")
animal1test2 <- (date >= "04/01/13" & date <= "06/01/13")
animal2test  <- (date >= "04/01/13" & date <= "06/01/13")

to use in an 'if else' formula:

if (animal1test) {
  collar[1] = animal1
} else if (animal1test2) {
  collar[5] = animal1
} else if (animal2test) {
  collar[1] = animal2
} else NA

As I'm sure you can see, this is completely inelegant, and also not working for me! Any ideas on how to achieve this? Thanks SO much in advance, Cat
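One vectorised way to do this kind of lookup (a sketch, with made-up data mirroring the tables above) is to merge the observations with the deployment table on collar and then keep only the rows whose date falls inside the deployment interval:

```r
# GPS fixes: collar, date, data value (invented to mirror the post)
obs <- data.frame(collar = c(1, 5, 1),
                  date = as.Date(c("2013-01-02", "2013-01-05", "2013-01-05")),
                  data = c("x", "b", "y"))

# Deployment table: which animal wore which collar, and when
key <- data.frame(animal = c(1, 1, 2),
                  collar = c(1, 5, 1),
                  start  = as.Date(c("2013-01-01", "2013-01-04", "2013-01-04")),
                  end    = as.Date(c("2013-01-03", "2013-01-06", "2013-01-06")))

m <- merge(obs, key, by = "collar")            # every fix/deployment pairing
m <- m[m$date >= m$start & m$date <= m$end, ]  # keep only in-interval pairs
m[, c("animal", "collar", "date", "data")]     # data now assigned by animal
```

Each fix is matched to every deployment of its collar, and the date filter then picks the single deployment that was active at that time.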
Re: [R] reviewer comment
My question: what does it mean that an asymmetric distribution could affect PCA? And also that outliers could affect factors?

It means what it says: PCA will be affected by asymmetry, and outliers will affect the principal components (sometimes loosely called 'factors'). In particular, an extreme outlying data point can cause at least one PC to be essentially parallel to the vector between the outlier and the mean of the rest of the data. If you want a picture of factors describing the bulk of the data set, you need to chuck out the extreme points or use robust PCA.

Asymmetry I'd worry less about, at least for exploratory graphical presentation; if I had a nice spherical data set I'd probably not be very interested in the PCA, because it'd not have much discriminatory power for groups. But inference based on things like Mahalanobis distance often relies on some sense of multivariate normality or the like, and if the model used for inference isn't built on a symmetric data set the inferences can be badly wrong. Think of the Turkish flag: the star is 'obviously' not part of the crescent, but in Mahalanobis distance it's not much further from the (empty) centre of the crescent than most of the crescent is.
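The pull of a single outlier on the first PC is easy to demonstrate (a sketch with simulated data, not from the original message):

```r
set.seed(1)
x <- matrix(rnorm(200), ncol = 2)      # a roughly spherical cloud of 100 points
pc_out <- prcomp(rbind(x, c(50, 50)))  # add one extreme point along (1, 1)

# the first loading vector is dragged almost parallel to the outlier
# direction, so its two components are nearly equal
ratio <- pc_out$rotation[2, 1] / pc_out$rotation[1, 1]
```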
[R] Combinations
Hi everyone, I have two sets T1={c1,c2,...,cn} and T2={k1,k2,...,kn}. How can I find the set of pairs as follows:

(c1,k1) (c1,k2) ... (c1,kn)
(c2,k1) (c2,k2) ... (c2,kn)
...
(cn,kn)

Thanks. Amir

--
Amir Darehshoorzadeh | Computer Engineering Department
PostDoc Fellow | University of Ottawa, PARADISE LAb
Email: adare...@uottawa.ca | 800 King Edward Ave
Tel: - | ON K1N 6N5, Ottawa - CANADA
http://personals.ac.upc.edu/amir
Re: [R] Data manipulation
Hello John, I thought I attached the file. So here we go:

Class <- c(1,1,1,1,1,1,1,2,2,2,2,2,2,2,3,3,3,3,3,3,3,3,3)
X <- c(0.1,0.1,0.1,0.1,0.2,0.2,0.2,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.2,0.2,0.2,0.2,0.3,0.3,0.3,0.3)
Count <- c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1)
by1 <- factor(Class)
by2 <- factor(X)
W <- aggregate(x = Count, by = list(by1, by2), FUN = sum)

However, what I want is a table that also includes lines for the Group.1 and Group.2 values for which there are no records. In other words something like this:

Thanks again. I hope it's clearer now. Ioanna

-----Original Message-----
From: John Kane [mailto:jrkrid...@inbox.com]
Sent: 15 March 2013 12:51
To: IOANNA; r-help@r-project.org
Subject: RE: [R] Data manipulation

What zero values? And are they actually zeros, or are they NAs, that is, missing values? The code looks okay, but without some sample data it is difficult to know exactly what you are doing. The easiest way to supply data is to use the dput() function. Example with your file named testfile:

dput(testfile)

Then copy the output and paste it into your email. For large data sets, you can just supply a representative sample. Usually, dput(head(testfile, 100)) will be sufficient.

http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example

Please supply some sample data.

John Kane
Kingston ON Canada

-----Original Message-----
From: ii54...@msn.com
Sent: Fri, 15 Mar 2013 12:40:54 +
To: r-help@r-project.org
Subject: [R] Data manipulation

Hello all, I would appreciate your thoughts on a seemingly simple problem. I have a database, where each row represents a single record.
I want to aggregate this database, so I use the aggregate command:

D <- read.csv("C:\\Users\\test.csv")
attach(D)
by1 <- factor(Class)
by2 <- factor(X)
W <- aggregate(x = Count, by = list(by1, by2), FUN = sum)

The results I get follow the form:

> W
  Group.1 Group.2 x
1       1     0.1 4
2       2     0.1 7
3       3     0.1 1
4       1     0.2 3
5       3     0.2 4
6       3     0.3 4

However, what I really want is an aggregation which includes the zero values, i.e.:

> W
  Group.1 Group.2 x
1       1     0.1 4
2       2     0.1 7
3       3     0.1 1
4       1     0.2 3
        2     0.2 0
5       3     0.2 4
        1     0.3 0
        2     0.3 0
6       3     0.3 4

How can I achieve what I want?

Best regards, Ioanna
Re: [R] Data manipulation
Wouldn't this do the same thing?

xtabs(Count ~ Class + X, D)

--
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of IOANNA
Sent: Friday, March 15, 2013 8:51 AM
To: 'John Kane'; 'Blaser Nello'; r-help@r-project.org
Subject: Re: [R] Data manipulation

Thanks a lot!

-----Original Message-----
From: John Kane [mailto:jrkrid...@inbox.com]
Sent: 15 March 2013 13:41
To: Blaser Nello; IOANNA; r-help@r-project.org
Subject: Re: [R] Data manipulation

Nice. That does look like it. IOANNA?

John Kane
Kingston ON Canada

-----Original Message-----
From: nbla...@ispm.unibe.ch
Sent: Fri, 15 Mar 2013 14:27:03 +0100
To: ii54...@msn.com, r-help@r-project.org
Subject: Re: [R] Data manipulation

Is this what you want to do?

D2 <- expand.grid(Class = unique(D$Class), X = unique(D$X))
D2 <- merge(D2, D, all = TRUE)
D2$Count[is.na(D2$Count)] <- 0
W <- aggregate(D2$Count, list(D2$Class, D2$X), sum)
W

Best, Nello
Re: [R] Grep with wildcards across multiple columns
Hi, You could try this for multiple intersects:

dt[Reduce(function(...) intersect(...),
          list(grep(par.fund, fund), grep(par.func, func), grep(par.obj, obj))),
   sum(amount), by = c('code', 'year')]
#    code year     V1
# 1: 1001 2011 123528
# 2: 1001 2012  97362
# 3: 1002 2011 103811
# 4: 1002 2012  97179

dt[intersect(intersect(grep(par.fund, fund), grep(par.func, func)),
             grep(par.obj, obj)),
   sum(amount), by = c('code', 'year')]
#    code year     V1
# 1: 1001 2011 123528
# 2: 1001 2012  97362
# 3: 1002 2011 103811
# 4: 1002 2012  97179

A.K.

----- Original Message -----
From: Bush, Daniel P. DPI daniel.b...@dpi.wi.gov
To: 'r-help@r-project.org' r-help@r-project.org
Cc: 'William Dunlap' wdun...@tibco.com; 'smartpink...@yahoo.com'; 'djmu...@gmail.com'
Sent: Friday, March 15, 2013 10:06 AM
Subject: RE: Grep with wildcards across multiple columns
Re: [R] question about nls
As Gabor indicates, using a start based on a good approximation is usually helpful, and nls() will generally find solutions to problems where there are such starts, hence the SelfStart methods. The Marquardt approaches are more of a pit-bull approach to the original specification: they grind away at the problem without much finesse, but generally get there eventually. If one is solving lots of problems of a similar type, good starts are the way to go. One-off (or being lazy), I like Marquardt.

It would be interesting to know what proportion of random starting points in some reasonable bounding box get the singular gradient message or other early termination with nls() vs. a Marquardt approach, especially as this is a tiny problem. This is just one example of the issue R developers face in balancing performance and robustness. The GN method in nls() is almost always a good deal more efficient than Marquardt approaches when it works, but suffers from a fairly high failure rate.

JN

On 13-03-15 10:01 AM, Gabor Grothendieck wrote:
On Fri, Mar 15, 2013 at 9:45 AM, Prof J C Nash (U30A) nas...@uottawa.ca wrote:
Actually, it likely won't matter where you start. The Gauss-Newton direction is nearly always close to 90 degrees from the gradient, as seen by turning on trace=TRUE in the nlmrt package's function nlxb(), which does a safeguarded Marquardt calculation. This can be used in place of nls(), except you need to put your data in a data frame. It finds a solution pretty straightforwardly, though with quite a few iterations and function evaluations.

Interesting observation, but it does converge in 5 iterations with the improved starting value, whereas it fails due to a singular gradient with the original starting value.
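JN's question about the proportion of random starts that fail can be probed directly with base R's nls(); the bounding box below is an arbitrary choice, so the resulting failure share is only indicative:

```r
# Data from Gabor's example
DF <- data.frame(x = c(60, 80, 100, 120), y = c(0.8, 6.5, 20.5, 45.9))

set.seed(123)
n <- 50
fails <- 0
for (i in seq_len(n)) {
  # random start drawn from an (arbitrary) bounding box
  st <- list(a = runif(1, -1, 1), b = runif(1, 0, 0.1), d = runif(1, -10, 10))
  fit <- tryCatch(nls(y ~ exp(a + b * x) + d, DF, start = st),
                  error = function(e) NULL)  # singular gradient, etc.
  if (is.null(fit)) fails <- fails + 1
}
fails / n   # share of random starts on which nls() gave up
```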
Re: [R] Combinations
HI, Try this:

T1 <- paste0("c", 1:5)
T2 <- paste0("k", 1:5)
as.vector(outer(T1, T2, paste, sep = ","))
# [1] "c1,k1" "c2,k1" "c3,k1" "c4,k1" "c5,k1" "c1,k2" "c2,k2" "c3,k2" "c4,k2"
#[10] "c5,k2" "c1,k3" "c2,k3" "c3,k3" "c4,k3" "c5,k3" "c1,k4" "c2,k4" "c3,k4"
#[19] "c4,k4" "c5,k4" "c1,k5" "c2,k5" "c3,k5" "c4,k5" "c5,k5"

# or
paste("(", as.vector(outer(T1, T2, paste, sep = ",")), ")", sep = "")
# [1] "(c1,k1)" "(c2,k1)" "(c3,k1)" "(c4,k1)" "(c5,k1)" "(c1,k2)" "(c2,k2)"
# [8] "(c3,k2)" "(c4,k2)" "(c5,k2)" "(c1,k3)" "(c2,k3)" "(c3,k3)" "(c4,k3)"
#[15] "(c5,k3)" "(c1,k4)" "(c2,k4)" "(c3,k4)" "(c4,k4)" "(c5,k4)" "(c1,k5)"
#[22] "(c2,k5)" "(c3,k5)" "(c4,k5)" "(c5,k5)"

A.K.

----- Original Message -----
From: Amir adare...@uottawa.ca
To: r-help@r-project.org
Sent: Friday, March 15, 2013 9:22 AM
Subject: [R] Combinations
Re: [R] Combinations
At Fri, 15 Mar 2013 09:22:15 -0400, Amir wrote:
I have two sets T1={c1,c2,...,cn} and T2={k1,k2,...,kn}. How can I find the sets as follow: (c1,k1), (c1,k2) ... (c1,kn) (c2,k1) (c2,k2) ... (c2,kn) ... (cn,kn)

I think you are looking for expand.grid:

> expand.grid(1:3, 10:13)
   Var1 Var2
1     1   10
2     2   10
3     3   10
4     1   11
...

Neal
Re: [R] Combinations
Is ?expand.grid what you are looking for?

Rgds, Rainer

On Friday 15 March 2013 09:22:15 Amir wrote:
Hi every one, I have two sets T1={c1,c2,...,cn} and T2={k1,k2,...,kn}. How can I find the sets as follow: (c1,k1), (c1,k2) ... (c1,kn) (c2,k1) (c2,k2) ... (c2,kn) ... (cn,kn) Thanks. Amir
Re: [R] Data manipulation
I was too quick on the Send button. Xtabs produces a table. If you want a data.frame, it would be data.frame(xtabs(Count ~ Class + X, D)):

# Match John's summary table and generate Counts
set.seed(42)
Count <- sample(1:50, 23)
Class <- c(rep(1, 4), rep(2, 7), 3, rep(1, 3), rep(3, 4), rep(3, 4))
X <- c(rep(.1, 12), rep(.2, 7), rep(.3, 4))
D <- data.frame(Class = factor(Class), X = factor(X), Count)

> table(D$Class, D$X)
    0.1 0.2 0.3
  1   4   3   0
  2   7   0   0
  3   1   4   4

# Create the table/data.frame
D.table <- xtabs(Count ~ Class + X)

> D.table
     X
Class 0.1 0.2 0.3
    1 150  63   0
    2 169   0   0
    3  41  98 114

D.df <- data.frame(D.table)

> D.df
  Class   X Freq
1     1 0.1  150
2     2 0.1  169
3     3 0.1   41
4     1 0.2   63
5     2 0.2    0
6     3 0.2   98
7     1 0.3    0
8     2 0.3    0
9     3 0.3  114

--
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of David L Carlson
Sent: Friday, March 15, 2013 9:23 AM
To: 'IOANNA'; 'John Kane'; 'Blaser Nello'; r-help@r-project.org
Subject: Re: [R] Data manipulation

Wouldn't this do the same thing?

xtabs(Count ~ Class + X, D)

-----Original Message-----
From: nbla...@ispm.unibe.ch
Sent: Fri, 15 Mar 2013 14:27:03 +0100
To: ii54...@msn.com, r-help@r-project.org
Subject: Re: [R] Data manipulation

Is this what you want to do?
D2 <- expand.grid(Class = unique(D$Class), X = unique(D$X))
D2 <- merge(D2, D, all = TRUE)
D2$Count[is.na(D2$Count)] <- 0
W <- aggregate(D2$Count, list(D2$Class, D2$X), sum)
W

Best, Nello
Re: [R] metafor - multivariate analysis
Dear Owen,

What is your definition of a multivariate analysis? Do you mean a meta-regression model with more than one predictor/moderator? In that case, yes, metafor handles that. Usually this is referred to as multiple regression (as opposed to simple regression with a single predictor) -- and in the case of a meta-analysis, I guess one could call it multiple meta-regression. If you are referring to a model that handles statistical dependencies in the observed outcomes (and hence requires multivariate methods), then you will have to use some other package (e.g., mvmeta). See also:

http://stats.stackexchange.com/questions/2358/explain-the-difference-between-multiple-regression-and-multivariate-regression

Best, Wolfgang

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Owen, Branwen
Sent: Friday, March 15, 2013 11:23
To: r-help@r-project.org
Subject: [R] metafor - multivariate analysis

Dear Metafor users, I'm conducting a meta-analysis of the prevalence of a particular behaviour, based on someone else's code. I've been labouring under the impression that this:

summary(rma.1 <- rma(yi, vi, mods = cbind(approxmeanage, interviewmethodcode),
                     data = mal, method = "DL", knha = F, weighted = F, intercept = T))

is doing the multivariate analysis that I want, but I have read that multivariate analysis can't be done in metafor.
This is the output:

Mixed-Effects Model (k = 22; tau^2 estimator: DL)

  logLik  Deviance       AIC       BIC
 18.7726  -37.5452  -27.5452  -22.0899

tau^2 (estimate of residual amount of heterogeneity): 0.0106
tau (sqrt of the estimate of residual heterogeneity): 0.1031

Test for Residual Heterogeneity:
QE(df = 18) = 1273.9411, p-val < .0001

Test of Moderators (coefficient(s) 2,3,4):
QM(df = 3) = 11.0096, p-val = 0.0117

Model Results:

                     estimate      se     zval    pval    ci.lb    ci.ub
intrcpt                0.4014  0.1705   2.3537  0.0186   0.0671   0.7356  *
continent             -0.0206  0.0184  -1.1200  0.2627  -0.0568   0.0155
approxmeanage          0.0076  0.0091   0.8354  0.4035  -0.0102   0.0254
interviewmethodcode   -0.0892  0.0273  -3.2702  0.0011  -0.1426  -0.0357  **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

My questions are:
1. What is this line of code doing?
2. If it isn't multivariate analysis, will I have to use the mvmeta package instead?

Thanks very much for any help,
Branwen

http://r.789695.n4.nabble.com/metafor-multivariate-analysis-td4661233.html
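To make Wolfgang's terminology concrete: passing several moderators to `mods` fits a *multiple* meta-regression (one coefficient per moderator), which is what Branwen's call does -- not a multivariate model in Wolfgang's second sense. A hedged sketch using metafor's built-in dat.bcg example data (requires the metafor package; guarded so it is skipped if the package is not installed):

```r
# Sketch only: multiple meta-regression with the mods formula interface.
# dat.bcg and escalc() are part of the metafor package.
if (requireNamespace("metafor", quietly = TRUE)) {
  library(metafor)
  dat <- escalc(measure = "RR", ai = tpos, bi = tneg, ci = cpos, di = cneg,
                data = dat.bcg)
  # Two moderators at once -> multiple (not multivariate) meta-regression,
  # analogous to Branwen's approxmeanage + interviewmethodcode call
  res <- rma(yi, vi, mods = ~ ablat + year, data = dat, method = "DL")
  print(summary(res))  # one test per moderator, plus an omnibus QM test
}
```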
[R] Normalized 2-D cross-correlation
Hi all,

I need to do (normalized) 2-D cross-correlation in R. There is a convenient function available in Matlab (see http://www.mathworks.de/de/help/images/ref/normxcorr2.html). Is there anything comparable available in R?

Thanks,
Felix
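For small inputs, the "valid" part of Matlab's normxcorr2 can be sketched directly in base R: slide the template over the image and compute the Pearson correlation between the template and each image patch. The function name and the loop approach below are my own (an FFT-based version would be needed for speed on large images):

```r
# Minimal pure-R sketch of normalized 2-D cross-correlation (valid region only).
normxcorr2 <- function(tpl, img) {
  tr <- nrow(tpl); tc <- ncol(tpl)
  tz <- tpl - mean(tpl)                 # zero-mean template
  tn <- sqrt(sum(tz^2))                 # template norm
  out <- matrix(NA_real_, nrow(img) - tr + 1, ncol(img) - tc + 1)
  for (i in seq_len(nrow(out))) {
    for (j in seq_len(ncol(out))) {
      patch <- img[i:(i + tr - 1), j:(j + tc - 1)]
      pz <- patch - mean(patch)
      # correlation coefficient; NaN if a patch is constant
      out[i, j] <- sum(pz * tz) / (sqrt(sum(pz^2)) * tn)
    }
  }
  out
}

set.seed(42)
img <- matrix(rnorm(100), 10, 10)
tpl <- img[4:6, 5:7]                    # template cut from the image itself
cc  <- normxcorr2(tpl, img)
which(cc == max(cc), arr.ind = TRUE)    # peak at the template's true offset
```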
Re: [R] Reassign values based on multiple conditions
This might get you started, but more data is needed to test this.

# First create the data
Collars <- data.frame(collar=1:5,
                      date=as.POSIXlt(c("01/01/2013", "02/01/2013", "04/01/2013",
                                        "04/01/2013", "07/01/2013"), format="%m/%d/%Y"),
                      data=letters[c(24:26, 1:2)])
Animals <- data.frame(animal=c(1, 1, 1, 2, 2), collar=c(1, 5, 3, 2, 1),
                      start_date=as.POSIXlt(c("01/01/2013", "04/01/2013", "07/01/2013",
                                              "01/01/2013", "04/01/2013"), format="%m/%d/%Y"),
                      end_date=as.POSIXlt(c("03/01/2013", "06/01/2013", "09/01/2013",
                                            "03/01/2013", "06/01/2013"), format="%m/%d/%Y"))

Now you want to look at each row in Collars and find the number of the animal that it represents in Animals:

> AC <- rep(NA, nrow(Collars))
> for (i in 1:nrow(Collars)) {
+   AC[i] <- Animals$animal[Animals$collar == Collars$collar[i] &
+            Animals$start_date <= Collars$date[i] & Animals$end_date >= Collars$date[i]]
+ }
Error in AC[i] <- Animals$animal[Animals$collar == Collars$collar[i] :
  replacement has length zero
> AC
[1]  1  2 NA NA NA

AC is the vector of animal numbers that has the same length as the Collars data.frame. In this case only the first two rows in Collars match anything in Animals, so the rest are NA and R prints an error message.

--
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352

-----Original Message-----
From: John Kane
Sent: Friday, March 15, 2013 9:07 AM
To: Cat Cowie; r-help@r-project.org
Subject: Re: [R] Reassign values based on multiple conditions

I don't see how the data in the three-column table you present is enough to produce the four-column test. Should the first table actually show repeated collar usage, so that you can use the next occurrence of the collar as the end date, e.g.

1 01/01/2013
1 02/04/2013

and so on? Some actual data might be useful. The easiest way to supply data is to use the dput() function. Example with your file named testfile:

dput(testfile)

Then copy the output and paste it into your email.
For large data sets, you can just supply a representative sample. Usually, dput(head(testfile, 100)) will be sufficient.

Sorry I'm not more helpful.

John Kane
Kingston ON Canada

-----Original Message-----
From: cat.e.co...@gmail.com
Sent: Fri, 15 Mar 2013 12:46:13 +0800
To: r-help@r-project.org
Subject: [R] Reassign values based on multiple conditions

Hi all,

I have a simple data frame of three columns - one of numbers (really a categorical variable), one of dates and one of data. Imagine:

collar date       data
1      01/01/2013 x
2      02/01/2013 y
3      04/01/2013 z
4      04/01/2013 a
5      07/01/2013 b

The 'collar' is a GPS collar that has been worn by an animal for a certain amount of time, and may then have been worn by a different animal after the batteries needed to be changed. When an animal was caught and the collar battery needed to be changed, a whole new collar had to be put on, as these animals (wild boar and red deer!) were not that easy to catch. In order to follow the movements of each animal I now need to create a new column that assigns the 'data' by animal rather than by collar. I have a table of dates, e.g.

animal collar start_date end_date
1      1      01/01/2013 03/01/2013
1      5      04/01/2013 06/01/2013
1      3      07/01/2013 09/01/2013
2      2      01/01/2013 03/01/2013
2      1      04/01/2013 06/01/2013

I have so far been able to make multi-conditional tests:

animal1test  <- (date >= "01/01/13" & date <= "03/01/13")
animal1test2 <- (date >= "04/01/13" & date <= "06/01/13")
animal2test  <- (date >= "04/01/13" & date <= "06/01/13")

to use in an 'if else' formula:

if (animal1test) {
  collar[1] = animal1
} else if (animal1test2) {
  collar[5] = animal1
} else if (animal2test) {
  collar[1] = animal2
} else NA

As I'm sure you can see, this is completely inelegant, and also not working for me! Any ideas on how to achieve this?
Thanks SO much in advance,
Cat
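A vectorized alternative to the loop above: build every collar/animal pairing with merge(), keep only the rows whose date falls inside the wear interval, then join the animal id back onto the collar records. Unmatched records get NA instead of raising an error. The data below mimic Cat's tables; I have assumed dd/mm/yyyy dates and the column names are mine.

```r
# Toy versions of Cat's two tables (dates assumed to be dd/mm/yyyy)
Collars <- data.frame(collar = 1:5,
                      date = as.Date(c("01/01/2013", "02/01/2013", "04/01/2013",
                                       "04/01/2013", "07/01/2013"), format = "%d/%m/%Y"),
                      data = c("x", "y", "z", "a", "b"))
Animals <- data.frame(animal = c(1, 1, 1, 2, 2),
                      collar = c(1, 5, 3, 2, 1),
                      start  = as.Date(c("01/01/2013", "04/01/2013", "07/01/2013",
                                         "01/01/2013", "04/01/2013"), format = "%d/%m/%Y"),
                      end    = as.Date(c("03/01/2013", "06/01/2013", "09/01/2013",
                                         "03/01/2013", "06/01/2013"), format = "%d/%m/%Y"))

m <- merge(Collars, Animals, by = "collar")    # all candidate collar/animal pairings
m <- m[m$date >= m$start & m$date <= m$end, ]  # keep rows where date is inside the interval
res <- merge(Collars, m[, c("collar", "date", "animal")],
             by = c("collar", "date"), all.x = TRUE)  # NA where no animal wore the collar
res
```

With these toy tables only collars 1 and 2 have an animal whose wear interval covers the observation date; the other three rows come back NA, matching what David's loop found.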
[R] missing values in an array
Dear All,

I've an array with some missing values (NA) in between. I want to remove that particular matrix if a missing value is detected. How can I do so? Thank you very much.

Best regards,
Ray
Re: [R] missing values in an array
Hi,

Try this:

set.seed(25)
arr1 <- array(sample(c(1:40, NA), 60, replace=TRUE), dim=c(5, 4, 3))
arr1[,, sapply(seq(dim(arr1)[3]), function(i) all(!is.na(arr1[,,i])))]
#     [,1] [,2] [,3] [,4]
#[1,]    2   13   34   17
#[2,]   19    3   15   39
#[3,]    4   25   10   16
#[4,]    7   22    5    7
#[5,]   12   10   35    6

# 2nd case
set.seed(46)
arr2 <- array(sample(c(1:40, NA), 60, replace=TRUE), dim=c(5, 4, 3))
arr2[,, sapply(seq(dim(arr2)[3]), function(i) all(!is.na(arr2[,,i])))]
#, , 1
#     [,1] [,2] [,3] [,4]
#[1,]    8   27   11   28
#[2,]   10   37    5   40
#[3,]   24   25   28    6
#[4,]   15   37    3   25
#[5,]   10   39   32   23
#
#, , 2
#     [,1] [,2] [,3] [,4]
#[1,]   14    2    8   27
#[2,]   10   39   37    4
#[3,]    9   36   15    6
#[4,]   33   16   20   32
#[5,]   21    6   28   15

A.K.

----- Original Message -----
From: Ray Cheung ray1...@gmail.com
To: R help r-help@r-project.org
Sent: Friday, March 15, 2013 12:08 PM
Subject: [R] missing values in an array

Dear All, I've an array with some missing values (NA) in between. I want to remove that particular matrix if a missing value is detected. How can I do so? Thank you very much.

Best regards,
Ray
Re: [R] How to make the labels of pie chart are not overlapping?
Simple -- don't make a pie chart.

-- Bert

(Seriously -- this is an awful display. Consider, instead, a bar plot plotting cumulative sums of percentages with products/bars ordered from largest percentage to smallest; or plotting just the percentages in that order, depending on which is more informative.)

On Fri, Mar 15, 2013 at 6:58 AM, Tammy Ma metal_lical...@live.com wrote:

I have the following dataframe:

Product predicted_MarketShare Predicted_MS_Percentage
A       2.827450e-02          2.8
B       4.716403e-06          0.0
C       1.741686e-01          17.4
D       1.716303e-04          0.0
...

Because there are so many products, and most of the predicted market shares are around 0%, the labels of the products with 0% market share overlap when I make a pie chart. How do I make the labels not overlap?

Kind regards,
Tammy

--
Bert Gunter
Genentech Nonclinical Biostatistics
Internal Contact Info:
Phone: 467-7374
Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
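Bert's suggestion can be sketched in a few lines: sort the products by share and draw a bar plot, so the near-zero products simply end up as short bars at the right instead of overlapping pie labels. The data frame below mimics Tammy's (the column names are taken from her post):

```r
# Toy stand-in for Tammy's data
dat <- data.frame(Product = c("A", "B", "C", "D"),
                  Predicted_MS_Percentage = c(2.8, 0.0, 17.4, 0.0))

dat <- dat[order(-dat$Predicted_MS_Percentage), ]   # largest share first
barplot(dat$Predicted_MS_Percentage,
        names.arg = as.character(dat$Product),
        las = 2, ylab = "Predicted market share (%)")
```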
Re: [R] missing values in an array
On 15-03-2013, at 17:08, Ray Cheung ray1...@gmail.com wrote:

> Dear All, I've an array with some missing values (NA) in between. I want to remove that particular matrix if a missing value is detected. How can I do so? Thank you very much.

It is not clear what the dimension of your array is. If your array/matrix is two-dimensional, then

any(is.na(A))   # A is the name of the array/matrix

will return TRUE if at least one element of A is NA, and then you can delete A. If your array has three dimensions then you'll have to look at arun's solution.

Berend
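Arun's sapply() idea for the three-dimensional case can also be written with apply() over the third margin, and adding drop = FALSE keeps the result a 3-D array even when only one slice survives:

```r
# Drop every 2x3 slice of a 2x3x4 array that contains an NA
arr <- array(1:24, dim = c(2, 3, 4))
arr[1, 2, 3] <- NA                    # poison the third slice

keep  <- !apply(is.na(arr), 3, any)   # one logical per slice
clean <- arr[, , keep, drop = FALSE]  # drop = FALSE preserves the array shape
dim(clean)                            # third dimension shrinks from 4 to 3
```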
Re: [R] How to make the labels of pie chart are not overlapping?
On Fri, Mar 15, 2013 at 12:30 PM, Bert Gunter gunter.ber...@gene.com wrote:

> Simple -- don't make a pie chart.

This is great advice. But if you (or your boss) insist on pie charts, then you should provide us with a reproducible example that illustrates your problem.

dat <- read.table(text="Product predicted_MarketShare Predicted_MS_Percentage
A 2.827450e-02 2.8
B 4.716403e-06 0.0
C 1.741686e-01 17.4
D 1.716303e-04 0.0", header=TRUE)
pie(dat[[2]], labels=dat[[1]])

does not give overlapping labels, so I don't yet have an example of the problem you are trying to solve.

Best,
Ista
[R] Help finding first value in a BY group
I have a large Excel file with SKU numbers (stock keeping units) and forecasts, which can be mimicked with the following:

Period <- c(1, 2, 3, 1, 2, 3, 4, 1, 2)
SKU <- c("A1","A1","A1","X4","X4","X4","X4","K2","K2")
Forecast <- c(99, 103, 128, 63, 69, 72, 75, 207, 201)
PeriodSKUForecast <- data.frame(Period, SKU, Forecast)

> PeriodSKUForecast
  Period SKU Forecast
1      1  A1       99
2      2  A1      103
3      3  A1      128
4      1  X4       63
5      2  X4       69
6      3  X4       72
7      4  X4       75
8      1  K2      207
9      2  K2      201

I need to create a matrix with only the first forecast for each SKU:

A1  99
X4  63
K2 207

The Period for the first forecast will always be the minimum value for an SKU. Can anyone suggest how I might accomplish this?

Thank you,

--
Barry E. King, Ph.D.
Director of Retail Operations
Qualex Consulting Services, Inc.
barry.k...@qlx.com
O: (317)940-5464
M: (317)507-0661
Re: [R] Help finding first value in a BY group
Hi,

Try:

data.frame(Forecast=with(PeriodSKUForecast, tapply(Forecast, SKU, head, 1)))
#   Forecast
#A1       99
#K2      207
#X4       63

# or
aggregate(Forecast ~ SKU, data=PeriodSKUForecast, head, 1)
#  SKU Forecast
#1  A1       99
#2  K2      207
#3  X4       63

# or
library(plyr)
ddply(PeriodSKUForecast, .(SKU), summarise, Forecast=head(Forecast, 1))
#  SKU Forecast
#1  A1       99
#2  K2      207
#3  X4       63

A.K.

----- Original Message -----
From: Barry King barry.k...@qlx.com
To: r-help@r-project.org
Sent: Friday, March 15, 2013 1:30 PM
Subject: [R] Help finding first value in a BY group
[R] multiple filled.contour() on single page
Hi,

It seems as if filled.contour() can't be used along with layout(), or par(mfrow) or the like, since it sets up the page in a very particular manner. Someone posted a workaround (http://r.789695.n4.nabble.com/several-Filled-contour-plots-on-the-same-device-td819040.html). Has a better approach been developed for achieving this? Thanks.

Cheers,
-- Seb
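One hedged workaround, as a sketch rather than a full answer: skip filled.contour() entirely (it claims the whole page for its own layout to fit the color key) and draw the filled surfaces with image(), which does respect par(mfrow); the shared key then has to be drawn by hand or dropped.

```r
# Two filled surfaces side by side; image() honors mfrow, filled.contour() does not
pal <- heat.colors(12)
par(mfrow = c(1, 2))
image(volcano, col = pal, main = "raw")
image(sqrt(volcano), col = pal, main = "sqrt")
par(mfrow = c(1, 1))
```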
[R] multiple frequencies per second again
Dear R People:

I have the following situation. I have observations that are 128 samples per second, which is fine. I want to fit them with ARIMA models, also fine. My question is, please: when I do my forecasting, do I need to do anything special with the n.ahead parameter? Here is the initial setup:

> xx <- ts(rnorm(128), start=0, freq=128)
> str(xx)
 Time-Series [1:128] from 0 to 0.992: -1.07 0.498 1.508 0.354 -0.497 ...
> xx.ar <- arima(xx, order=c(1,0,0))
> str(xx.ar)
List of 13
 $ coef     : Named num [1:2] -0.0818 0.0662
  ..- attr(*, "names")= chr [1:2] "ar1" "intercept"
 $ sigma2   : num 1.06
 $ var.coef : num [1:2, 1:2] 7.78e-03 -5.09e-05 -5.09e-05 7.07e-03
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:2] "ar1" "intercept"
  .. ..$ : chr [1:2] "ar1" "intercept"
 $ mask     : logi [1:2] TRUE TRUE
 $ loglik   : num -185
 $ aic      : num 376
 $ arma     : int [1:7] 1 0 0 0 128 0 0
 $ residuals: Time-Series [1:128] from 0 to 0.992: -1.133 0.338 1.477 0.406 -0.54 ...
 $ call     : language arima(x = xx, order = c(1, 0, 0))
 $ series   : chr "xx"
 $ code     : int 0
 $ n.cond   : int 0
 $ model    :List of 10
  ..$ phi  : num -0.0818
  ..$ theta: num(0)
  ..$ Delta: num(0)
  ..$ Z    : num 1
  ..$ a    : num 0.156
  ..$ P    : num [1, 1] 0
  ..$ T    : num [1, 1] -0.0818
  ..$ V    : num [1, 1] 1
  ..$ h    : num 0
  ..$ Pn   : num [1, 1] 1
 - attr(*, "class")= chr "Arima"
> predict(xx.ar, n.ahead=3)
$pred
Time Series:
Start = c(1, 1)
End = c(1, 3)
Frequency = 128
[1] 0.05346814 0.06728105 0.06615104

$se
Time Series:
Start = c(1, 1)
End = c(1, 3)
Frequency = 128
[1] 1.028302 1.031737 1.031760

Thanks for any help.

Sincerely,
Erin

--
Erin Hodgess
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: erinm.hodg...@gmail.com
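One point worth making explicit: n.ahead in predict() is counted in observations (model steps), not in time units, so nothing special is needed -- but at frequency = 128, forecasting one further second requires n.ahead = 128. A small sketch:

```r
# n.ahead counts steps; at 128 Hz, 128 steps = one further second
set.seed(1)
xx  <- ts(rnorm(128), start = 0, frequency = 128)
fit <- arima(xx, order = c(1, 0, 0))
p   <- predict(fit, n.ahead = 128)

length(p$pred)   # 128 forecasts
time(p$pred)[1]  # 1 -- the first instant after the observed 0 ... 127/128
```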
Re: [R] Add a continuous color ramp legend to a 3d scatter plot
Marc,

Thank you very much for your reply! It helps tremendously!

best,
Z

On Fri, Mar 15, 2013 at 2:37 AM, Marc Girondot marc_...@yahoo.fr wrote:

On 14/03/13 18:15, Zhuoting Wu wrote:
> I have two follow-up questions:
>
> 1. If I want to reverse the heat.colors (i.e., from yellow to red instead of red to yellow), is there a way to do that?

nbcol <- heat.colors(128)
nbcol <- nbcol[128:1]

> 2. I also created this interactive 3d scatter plot as below:
>
> library(rgl)
> plot3d(x=x, y=y, z=z, col=nbcol[zcol], box=FALSE)
>
> Is there any way to add the same legend to this 3d plot? I'm new to R and trying to learn it. I'm very grateful for any help!
>
> thanks, Z

I have never used such a plot. Sorry.

Marc

--
Marc Girondot, Pr
Laboratoire Ecologie, Systématique et Evolution
Equipe de Conservation des Populations et des Communautés
CNRS, AgroParisTech et Université Paris-Sud 11, UMR 8079
Bâtiment 362, 91405 Orsay Cedex, France
Tel: 33 (0)1.69.15.72.30
Fax: 33 (0)1.69.15.73.53
e-mail: marc.giron...@u-psud.fr
Web: http://www.ese.u-psud.fr/epc/conservation/Marc.html
Skype: girondot
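Marc's two-step reversal can also be written in one step with rev(), which is the idiomatic spelling:

```r
# Reverse the palette: yellow -> red instead of red -> yellow
nbcol <- rev(heat.colors(128))  # same result as heat.colors(128)[128:1]
head(nbcol, 2)
```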
[R] seeking tip to keep first of multiple observations per ID
Dear R community,

I am a neophyte and I cannot figure out how to keep only the first record for each ID in a data.frame that has assorted numbers of records per ID. I studied and found references to the packages plyr and sql for R, and I fear the documentation for those was over my head; I could not identify what may be there to reach my goal. If someone could point me toward a method I will gladly study the documentation, or if there is an example posted someplace I will follow it. THANKS!

Julie
[R] nlrob and robust nonlinear regression with upper and/or lower bounds on parameters
I have a question regarding robust nonlinear regression with nlrob. I would like to place lower bounds on the parameters, but when I call nlrob with limits it returns the following error:

Error in psi(resid/Scale, ...) :
  unused argument(s) (lower = list(Asym = 1, mid = 1, scal = 1))

After consulting the documentation I noticed that upper and lower are not listed as parameters in the nlrob help documentation. I haven't checked the source to confirm this yet, but I infer that nlrob simply doesn't support upper and lower bounds. For my current problem, I only require that the parameters be positive, so I simply rewrote the formula to be a function of the absolute value of the parameter. However, I have other problems where I am not so lucky. Are there robust nonlinear regression methods that support upper and lower bounds? Or am I simply missing something with nlrob? I've included example code that should illustrate the issue.

require(stats)
require(robustbase)

Dat <- NULL
Dat$x <- rep(1:25, 20)
set.seed(1)
Dat$y <- SSlogis(Dat$x, 10, 12, 2)*rnorm(500, 1, 0.1)
plot(Dat)

Dat.nls <- nls(y ~ SSlogis(x, Asym, mid, scal), data=Dat,
               start=list(Asym=1, mid=1, scal=1),
               lower=list(Asym=1, mid=1, scal=1))
Dat.nls
lines(1:25, predict(Dat.nls, newdata=list(x=1:25)), col=1)

Dat.nlrob <- nlrob(y ~ SSlogis(x, Asym, mid, scal), data=Dat,
                   start=list(Asym=1, mid=1, scal=1))
Dat.nlrob
lines(1:25, predict(Dat.nlrob, newdata=list(x=1:25)), col=2)

Dat.nlrob <- nlrob(y ~ SSlogis(x, Asym, mid, scal), data=Dat,
                   start=list(Asym=1, mid=1, scal=1),
                   lower=list(Asym=1, mid=1, scal=1))
Dat.nlrob

thanks,
Shane
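A hedged partial answer for the bounded (though not robust) part of the question: plain nls() does honor lower/upper when algorithm = "port", so box constraints are available in base R even if nlrob ignores them. The data below mimic Shane's example; the start values are mine, chosen near the truth to keep the sketch convergent, and bounds are passed as a named numeric vector rather than a list.

```r
# Bounded nonlinear least squares via the "port" algorithm (not robust)
set.seed(1)
Dat <- data.frame(x = rep(1:25, 20))
Dat$y <- SSlogis(Dat$x, 10, 12, 2) * rnorm(500, 1, 0.1)

fit <- nls(y ~ SSlogis(x, Asym, mid, scal), data = Dat,
           start = list(Asym = 5, mid = 10, scal = 1),
           lower = c(Asym = 0, mid = 0, scal = 0),
           algorithm = "port")
coef(fit)  # all parameters respect the requested lower bounds
```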
[R] Spearman rank correlation
Hi,

If I get a p-value less than 0.05, does that mean there is a significant relation between the two ranked lists? Sometimes I get a low correlation such as 0.3 or even 0.2 and the p-value is very low, such as 0.01 -- does that mean it is significant also? And would that be interpreted as a significant low positive correlation or a significant moderate positive correlation? Also, can R calculate the results for lists 30 for Spearman, or do I need to shift to Pearson correlation in that case? And finally, what does the (S=) in the results mean?

Many thanks
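On the last question only: the S that print(cor.test(..., method = "spearman")) reports is the sum of squared rank differences, from which rho = 1 - 6*S/(n*(n^2 - 1)) in the no-ties case. A tiny sketch:

```r
# The "S" statistic in Spearman's cor.test, reproduced by hand
x <- c(1, 2, 3, 4, 5)
y <- c(2, 1, 4, 3, 5)
ct <- cor.test(x, y, method = "spearman")

unname(ct$statistic)            # S = 4 for these data
sum((rank(x) - rank(y))^2)      # the same quantity computed directly
unname(ct$estimate)             # rho = 1 - 6*4/(5*24) = 0.8
```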
[R] merge two matrices
Dear R-help members,

I would be grateful if anyone could help me with the following problem. I would like to combine two matrices (SCH_15 and SCH_16; they are attached) which have a species presence/absence x sampling plot structure. The aim is to end up with a single matrix that shows all existing species and their presence/absence on all the different plots (an_1, an_2, etc.). To do this I used the merge function in R:

output <- merge(SCH_15, SCH_16, by="species", all=TRUE)

The problem is that if the same species occurs in both SCH files (i.e. the species Abutilon longicuspe occurs in both files), it is listed twice in the merged matrix. However, the aim is that each species is listed only once in the final matrix. How do I have to modify the R code? I guess I have to replace all=TRUE with something else, but I can't figure out what it is.

Thank you for your help,
Michael
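For what it is worth, merge(..., by="species", all=TRUE) itself should yield one row per species when the names match exactly; duplicated rows usually mean the names differ invisibly (trailing spaces, different case), which trimws()/tolower() applied to the species column before merging would expose. A minimal sketch with toy matrices standing in for SCH_15 and SCH_16 (the second and third species names are invented placeholders):

```r
# Toy presence/absence tables; "Abutilon longicuspe" occurs in both
SCH_15 <- data.frame(species = c("Abutilon longicuspe", "Acacia abyssinica"),
                     an_1 = c(1, 0), an_2 = c(0, 1))
SCH_16 <- data.frame(species = c("Abutilon longicuspe", "Bersama abyssinica"),
                     an_3 = c(1, 1))

out <- merge(SCH_15, SCH_16, by = "species", all = TRUE)
out[is.na(out)] <- 0   # absent (not missing) on plots the species was never scored on
out                    # one row per species, all plots as columns
```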
Re: [R] Help finding first value in a BY group
Hi,

There is a potential gotcha with the approach of using head(..., 1) in each of the solutions that Arun has below, which is the assumption that the data are sorted, as is the case in the example data. It seems reasonable to consider that the real data at hand may not be entered in order or presorted. If the data are not sorted (switching the order of the two K2-related entries):

Period <- c(1, 2, 3, 1, 2, 3, 4, 2, 1)
Forecast <- c(99, 103, 128, 63, 69, 72, 75, 201, 207)
SKU <- c("A1","A1","A1","X4","X4","X4","X4","K2","K2")
PeriodSKUForecast <- data.frame(Period, SKU, Forecast)

> PeriodSKUForecast
  Period SKU Forecast
1      1  A1       99
2      2  A1      103
3      3  A1      128
4      1  X4       63
5      2  X4       69
6      3  X4       72
7      4  X4       75
8      2  K2      201
9      1  K2      207

> with(PeriodSKUForecast, tapply(Forecast, SKU, head, 1))
 A1  K2  X4
 99 201  63

> aggregate(Forecast ~ SKU, data=PeriodSKUForecast, head, 1)
  SKU Forecast
1  A1       99
2  K2      201
3  X4       63

Note that the wrong value for K2 is returned. You would either have to pre-sort the data frame before using these approaches:

NewDF <- PeriodSKUForecast[with(PeriodSKUForecast, order(SKU, Period)), ]

> NewDF
  Period SKU Forecast
1      1  A1       99
2      2  A1      103
3      3  A1      128
9      1  K2      207
8      2  K2      201
4      1  X4       63
5      2  X4       69
6      3  X4       72
7      4  X4       75

> with(NewDF, tapply(Forecast, SKU, head, 1))
 A1  K2  X4
 99 207  63

or consider an approach that does not depend upon the sort order, but which subsets based upon the minimum value of Period for each SKU:

do.call(rbind, lapply(split(PeriodSKUForecast, PeriodSKUForecast$SKU),
                      function(x) x[which.min(x$Period), ]))
#   Period SKU Forecast
#A1      1  A1       99
#K2      1  K2      207
#X4      1  X4       63

or remove the Period column if you don't want it:

do.call(rbind, lapply(split(PeriodSKUForecast, PeriodSKUForecast$SKU),
                      function(x) x[which.min(x$Period), -1]))
#   SKU Forecast
#A1  A1       99
#K2  K2      207
#X4  X4       63

Regards,

Marc Schwartz

On Mar 15, 2013, at 12:37 PM, arun smartpink...@yahoo.com wrote:

> Hi,
> Try:
> data.frame(Forecast=with(PeriodSKUForecast, tapply(Forecast, SKU, head, 1)))
> aggregate(Forecast ~ SKU, data=PeriodSKUForecast, head, 1)
> library(plyr)
> ddply(PeriodSKUForecast, .(SKU), summarise, Forecast=head(Forecast, 1))
> A.K.
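A compact variant that bakes Marc's ordering fix in: sort first, then keep the first row per SKU with !duplicated(), which after the sort is guaranteed to be the minimum-Period row whatever order the data arrived in.

```r
# Unsorted input (K2 rows deliberately out of order, as in Marc's example)
Period   <- c(1, 2, 3, 1, 2, 3, 4, 2, 1)
SKU      <- c("A1","A1","A1","X4","X4","X4","X4","K2","K2")
Forecast <- c(99, 103, 128, 63, 69, 72, 75, 201, 207)
df <- data.frame(Period, SKU, Forecast)

df    <- df[order(df$SKU, df$Period), ]  # sort by SKU, then Period
first <- df[!duplicated(df$SKU), ]       # first (= minimum-Period) row per SKU
first                                    # A1/99, K2/207, X4/63, all at Period 1
```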
Re: [R] seeking tip to keep first of multiple observations per ID
Probably the first thing to do is supply some sample data. See https://github.com/hadley/devtools/wiki/Reproducibility for some suggestions. However, you may want to take a look at http://stackoverflow.com/questions/13279582/select-only-the-first-rows-for-each-unique-value-of-a-column-in-r and particularly at answer number 3, which uses the data.table package and looks like it may do what you want.

John Kane
Kingston ON Canada

-----Original Message-----
From: jsdroys...@bellsouth.net
Sent: Fri, 15 Mar 2013 12:06:05 -0400
To: r-help@r-project.org
Subject: [R] seeking tip to keep first of multiple observations per ID
Re: [R] Spearman rank correlation
R-help is not here to fill in the gaps in your statistical education. You may want to try CrossValidated, but even there you would be expected to do some searching, both of their website and in your textbooks.

On Mar 15, 2013, at 8:16 AM, Zia mel wrote:

> Hi, If I get a p-value less than 0.05 does that mean there is a significant relation between the 2 ranked lists? [...]

David Winsemius
Alameda, CA, USA
Re: [R] multiple filled.contour() on single page
On Mar 15, 2013, at 10:49 AM, Sebastian P. Luque wrote:

> It seems as if filled.contour can't be used along with layout(), or par(mfrow) or the like, since it sets the page in a very particular manner. Someone posted a workaround. Has a better approach been developed for achieving this?

I remember seeing a similar question posted last week that got an informative answer (both here and in the crossposting to SO). Since you are not describing what you want to change, it is difficult to know what will be a satisfying answer, but at the very least you should do a bit more searching.

David Winsemius
Alameda, CA, USA
Re: [R] R Source Code Plagiarism Detection
For the classes I teach at the University of Washington, we use: http://www.compsoftbook.com. It's an automated scientific computing grading system that supports R and includes MOSS checking. However, I would also be very interested in alternative and possibly stand-alone MOSS-type tools that can work with R source code.
Re: [R] seeking tip to keep first of multiple observations per ID
Hi, If you can dput() a small part of your dataset, e.g. dput(head(yourdataset, 20)), it would be helpful. Otherwise: dat1 <- data.frame(ID=rep(1:3,times=c(3,4,2)), col2=rnorm(9)) aggregate(. ~ ID, data=dat1, head, 1) # ID col2 #1 1 -0.0637622 #2 2 1.1782429 #3 3 0.4670021 A.K. - Original Message - From: Julie Royster jsdroys...@bellsouth.net To: r-help@r-project.org Cc: Sent: Friday, March 15, 2013 12:06 PM Subject: [R] seeking tip to keep first of multiple observations per ID Dear R community, I am a neophyte and I cannot figure out how to accomplish keeping only the first record for each ID in a data.frame that has assorted numbers of records per ID. I studied and found references to packages plyr and sql for R, and I fear the documentation for those was over my head and I could not identify what may be there to reach my goal. If someone could point me toward a method I will gladly study documentation, or if there is an example posted someplace I will follow it. THANKS! Julie
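Since the thread never spells it out: base R's duplicated() gives a one-liner for "first record per ID", with no extra packages. A minimal sketch on toy data (the column names are invented):

```r
dat1 <- data.frame(ID = rep(1:3, times = c(3, 4, 2)), col2 = 101:109)
# duplicated() is FALSE the first time each ID appears, TRUE afterwards
first <- dat1[!duplicated(dat1$ID), ]
first$col2  # 101 104 108 (rows 1, 4 and 8)
```

Unlike the aggregate() approach, this keeps every column of the data frame and preserves the original row names.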
[R] latex(test, collabel=) returns wrong latex code?
Hello: I'm working with a 2-dimensional table that looks sort of like test below. I'm trying to produce latex code that will add dimension names for both the rows and the columns. Using the following code, latex chokes when I include collabel='Vote', but it's fine without it. The code below produces the latex code further below. I'm confused by this, because it looks like it's creating two bits of text for each instance of \multicolumn. Is that really allowed in \multicolumn? Could someone clarify? Thank you! Yours, SJK library(Hmisc) test <- as.table(matrix(c(50,50,50,50), ncol=2)) latex(test, rowlabel='Gender', collabel='Vote', file='') % latex.default(test, rowlabel = "Gender", collabel = "vote", file = "") % \begin{table}[!tbp] \begin{center} \begin{tabular}{lrr} \hline\hline \multicolumn{1}{l}{Gender}&\multicolumn{1}{vote}{A}&\multicolumn{1}{l}{B}\tabularnewline \hline A&$50$&$50$\tabularnewline B&$50$&$50$\tabularnewline \hline \end{tabular} \end{center} \end{table} * Simon J. Kiss, PhD Assistant Professor, Wilfrid Laurier University 73 George Street Brantford, Ontario, Canada N3T 2C9 Cell: +1 905 746 7606 Please avoid sending me Word, PowerPoint or Excel attachments. Sending these documents puts pressure on many people to use Microsoft software and helps to deny them any other choice. In effect, you become a buttress of the Microsoft monopoly. To convert to plain text choose Text Only or Text Document as the Save As Type. Your computer may also have a program to convert to PDF format. Select File, then Print. Scroll through available printers and select the PDF converter. Click on the Print button and enter a name for the PDF file when requested.
Re: [R] seeking tip to keep first of multiple observations per ID
I had the same problem.
Re: [R] get the sign of a value
And as a footnote to the other replies, see help('Math', package='base'). R's online help has a number of topics that are broader than that of a single function, and that relatively new useRs might not have seen yet. Examples include ?Distributions (compare with ?rnorm) and ?Startup. -Don -- Don MacQueen Lawrence Livermore National Laboratory 7000 East Ave., L-627 Livermore, CA 94550 925-423-1062 On 3/14/13 12:41 PM, Dimitri Liakhovitski dimitri.liakhovit...@gmail.com wrote: Hello! Can't figure it out - hope it's simple: # I have some value (can be anything), e.g.: x = 12 # I'd like it to become 1. # If the value is negative (again, it can be anything), e.g.: y = -12 # I'd like it to become -1. How could I do it? Thanks a lot! -- Dimitri Liakhovitski
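To make the other replies concrete: the function in that Math group that does what Dimitri asks is sign(). A minimal sketch:

```r
x <- 12
y <- -12
sign(x)             # 1
sign(y)             # -1
sign(c(-3.5, 0, 7)) # -1 0 1  (zero maps to 0)
```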
Re: [R] Help finding first value in a BY group
ddply() is very handy, but sometimes it seems like overkill to select rows from a dataset by pulling it into pieces, selecting a row from each piece, then pasting the pieces back together again. Information like row names can be lost. The following uses a subscript to pull out the rows of interest. We compute the subscript with ave(), which does the same sort of looping that things in plyr do, but it operates on an integer vector rather than the whole data.frame. w <- with(PeriodSKUForecast, ave(Period, SKU, FUN=order)) PeriodSKUForecast[w==1,] Period SKU Forecast 1 1 A1 99 4 1 X4 63 9 1 K2 207 Note that the output rows are in the order they were in in the input data.frame and their row names come from the input also. If you want the first two periods for each SKU use the subscript w <= 2. Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Marc Schwartz Sent: Friday, March 15, 2013 11:57 AM To: arun Cc: R help Subject: Re: [R] Help finding first value in a BY group Hi, There is a potential gotcha with the approach of using head(..., 1) in each of the solutions that Arun has below, which is the assumption that the data is sorted, as is the case in the example data. It seems reasonable to consider that the real data at hand may not be entered in order or presorted. If the data is not sorted (switching the order of the two K2 related entries): Period <- c(1, 2, 3, 1, 2, 3, 4, 2, 1) Forecast <- c(99, 103, 128, 63, 69, 72, 75, 201, 207) SKU <- c("A1","A1","A1","X4","X4","X4","X4","K2","K2") PeriodSKUForecast <- data.frame(Period, SKU, Forecast) PeriodSKUForecast Period SKU Forecast 1 1 A1 99 2 2 A1 103 3 3 A1 128 4 1 X4 63 5 2 X4 69 6 3 X4 72 7 4 X4 75 8 2 K2 201 9 1 K2 207 with(PeriodSKUForecast, tapply(Forecast, SKU, head, 1)) A1 K2 X4 99 201 63 aggregate(Forecast ~ SKU, data=PeriodSKUForecast, head, 1) SKU Forecast 1 A1 99 2 K2 201 3 X4 63 Note that the wrong value for K2 is returned.
You would either have to pre-sort the data frame before using these approaches: NewDF <- PeriodSKUForecast[with(PeriodSKUForecast, order(SKU, Period)), ] NewDF Period SKU Forecast 1 1 A1 99 2 2 A1 103 3 3 A1 128 9 1 K2 207 8 2 K2 201 4 1 X4 63 5 2 X4 69 6 3 X4 72 7 4 X4 75 with(NewDF, tapply(Forecast, SKU, head, 1)) A1 K2 X4 99 207 63 Or consider an approach that does not depend upon the sort order, but which subsets based upon the minimum value of Period for each SKU: do.call(rbind, lapply(split(PeriodSKUForecast, PeriodSKUForecast$SKU), function(x) x[which.min(x$Period), ])) Period SKU Forecast A1 1 A1 99 K2 1 K2 207 X4 1 X4 63 or remove the Period column if you don't want it: do.call(rbind, lapply(split(PeriodSKUForecast, PeriodSKUForecast$SKU), function(x) x[which.min(x$Period), -1])) SKU Forecast A1 A1 99 K2 K2 207 X4 X4 63 Regards, Marc Schwartz On Mar 15, 2013, at 12:37 PM, arun smartpink...@yahoo.com wrote: Hi, Try: data.frame(Forecast=with(PeriodSKUForecast, tapply(Forecast, SKU, head, 1))) # Forecast #A1 99 #K2 207 #X4 63 #or aggregate(Forecast ~ SKU, data=PeriodSKUForecast, head, 1) # SKU Forecast #1 A1 99 #2 K2 207 #3 X4 63 #or library(plyr) ddply(PeriodSKUForecast, .(SKU), summarise, Forecast=head(Forecast, 1)) # SKU Forecast #1 A1 99 #2 K2 207 #3 X4 63 A.K.
- Original Message - From: Barry King barry.k...@qlx.com To: r-help@r-project.org Cc: Sent: Friday, March 15, 2013 1:30 PM Subject: [R] Help finding first value in a BY group I have a large Excel file with SKU numbers (stock keeping units) and forecasts which can be mimicked with the following: Period <- c(1, 2, 3, 1, 2, 3, 4, 1, 2) SKU <- c("A1","A1","A1","X4","X4","X4","X4","K2","K2") Forecast <- c(99, 103, 128, 63, 69, 72, 75, 207, 201) PeriodSKUForecast <- data.frame(Period, SKU, Forecast) PeriodSKUForecast Period SKU Forecast 1 1 A1 99 2 2 A1 103 3 3 A1 128 4 1 X4 63 5 2 X4 69 6 3 X4 72 7 4 X4 75 8 1 K2 207 9 2 K2 201 I need to create a matrix with only the first forecast for each SKU: A1 99 X4 63 K2 207 The Period for the first forecast will always be the minimum value for an SKU. Can anyone suggest how I might accomplish this? Thank you,
[R] Fw: Help finding first value in a BY group
Forgot to cc: to list - Forwarded Message - From: arun smartpink...@yahoo.com To: Marc Schwartz marc_schwa...@me.com Cc: Barry King barry.k...@qlx.com Sent: Friday, March 15, 2013 3:41 PM Subject: Re: [R] Help finding first value in a BY group Thanks Marc for catching that. You could also use ?ave() #unsorted PeriodSKUForecast[as.logical(with(PeriodSKUForecast, ave(Period, SKU, FUN=function(x) x==min(x)))), -1] # SKU Forecast #1 A1 99 #4 X4 63 #9 K2 207 #sorted NewDF[as.logical(with(NewDF, ave(Period, SKU, FUN=function(x) x==min(x)))), -1] #SKU Forecast #1 A1 99 #9 K2 207 #4 X4 63 A.K.
Re: [R] How to make the labels of pie chart are not overlapping?
On 03/16/2013 12:58 AM, Tammy Ma wrote: I have the following dataframe: Product predicted_MarketShare Predicted_MS_Percentage A 2.827450e-02 2.8 B 4.716403e-06 0.0 C 1.741686e-01 17.4 D 1.716303e-04 0.0 ... Because there are so many products, and most of the predicted market shares are around 0%, the labels of the products with 0% market share overlap when I make a pie chart. How do I keep the labels from overlapping? Hi Tammy, Obviously you have many more products than are shown above. Let us assume that their market share is distributed approximately as negative binomial and your C value is the maximum. You might have twenty products with market shares around: market_share <- c(0,0,0,0,1,1,1,1,2,2,3,3,4,5,5,6,10,11,15,17) names(market_share) <- LETTERS[1:20] If you try to plot this as a pie chart: pie(market_share) you do get a bunch of overprinted labels for the four zero values. Pie charts with more than four or five sectors are usually not the best way to display the distribution of your values, but if you must: par(mar=c(5,4,4,4)) pie(market_share, labels=c(rep("",4), names(market_share)[5:20])) par(xpd=TRUE) text(1.1, 0, "A,B,C,D=0") par(xpd=FALSE) Good luck with it. Jim
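As an alternative to the pie chart that Jim's reply hints at: a dot chart gives each product its own labelled row, so labels cannot collide no matter how many shares are near zero. A sketch reusing the invented market_share data from the reply:

```r
market_share <- c(0,0,0,0,1,1,1,1,2,2,3,3,4,5,5,6,10,11,15,17)
names(market_share) <- LETTERS[1:20]
# sorted dot chart: one row per product, no overlapping labels
dotchart(sort(market_share), xlab = "Market share (%)")
```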
Re: [R] latex(test, collabel=) returns wrong latex code?
Hi Simon the equivalent in xtable is library(xtable) xtable(test) % latex table generated in R 2.15.2 by xtable 1.7-0 package % Sat Mar 16 08:14:01 2013 \begin{table}[ht] \begin{center} \begin{tabular}{rrr} \hline & A & B \\ \hline A & 50.00 & 50.00 \\ B & 50.00 & 50.00 \\ \hline \end{tabular} \end{center} \end{table} I am wondering if the class is making things hard test A B A 50 50 B 50 50 # as a data.frame data.frame(test) Var1 Var2 Freq 1 A A 50 2 B A 50 3 A B 50 4 B B 50 # Add column names dimnames(test) <- list(c("Gender A", "Gender B"), c("Vote A", "Vote B")) test Vote A Vote B Gender A 50 50 Gender B 50 50 xtable(test) % latex table generated in R 2.15.2 by xtable 1.7-0 package % Sat Mar 16 08:34:34 2013 \begin{table}[ht] \begin{center} \begin{tabular}{rrr} \hline & Vote A & Vote B \\ \hline Gender A & 50.00 & 50.00 \\ Gender B & 50.00 & 50.00 \\ \hline \end{tabular} \end{center} \end{table} I suppose a similar thing will happen with latex(). latex() is a bit different in that it gives you \multicolumn for the header columns, which can be modified for justification (I have not used latex()). I think the problem is in the arrangement of the data or the names that you are sending to latex(); someone else may have a different opinion. HTH Duncan Duncan Mackay Department of Agronomy and Soil Science University of New England Armidale NSW 2351 Email: home: mac...@northnet.com.au At 05:33 16/03/2013, you wrote: Hello: I'm working with a 2-dimensional table that looks sort of like test below. I'm trying to produce latex code that will add dimension names for both the rows and the columns. Using the following code, latex chokes when I include collabel='Vote', but it's fine without it. The code below produces the latex code further below. I'm confused by this, because it looks like it's creating two bits of text for each instance of \multicolumn. Is that really allowed in \multicolumn? Could someone clarify? Thank you!
Yours, SJK
Re: [R] nlrob and robust nonlinear regression with upper and/or lower bounds on parameters
On 2013-03-15 07:57, Shane McMahon wrote: I have a question regarding robust nonlinear regression with nlrob. I would like to place lower bounds on the parameters, but when I call nlrob with limits it returns the following error: Error in psi(resid/Scale, ...) : unused argument(s) (lower = list(Asym = 1, mid = 1, scal = 1)) After consulting the documentation I noticed that upper and lower are not listed as parameters in the nlrob help documentation. I haven't checked the source to confirm this yet, but I infer that nlrob simply doesn't support upper and lower bounds. For my current problem, I only require that the parameters be positive, so I simply rewrote the formula to be a function of the absolute value of the parameter. However, I have other problems where I am not so lucky. Are there robust nonlinear regression methods that support upper and lower bounds? Or am I simply missing something with nlrob? I've included example code that should illustrate the issue. require(stats) require(robustbase) Dat <- NULL Dat$x <- rep(1:25, 20) set.seed(1) Dat$y <- SSlogis(Dat$x, 10, 12, 2)*rnorm(500, 1, 0.1) plot(Dat) Dat.nls <- nls(y ~ SSlogis(x, Asym, mid, scal), data=Dat, start=list(Asym=1,mid=1,scal=1), lower=list(Asym=1,mid=1,scal=1)); Dat.nls lines(1:25, predict(Dat.nls, newdata=list(x=1:25)), col=1) Dat.nlrob <- nlrob(y ~ SSlogis(x, Asym, mid, scal), data=Dat, start=list(Asym=1,mid=1,scal=1)); Dat.nlrob lines(1:25, predict(Dat.nlrob, newdata=list(x=1:25)), col=2) Dat.nlrob <- nlrob(y ~ SSlogis(x, Asym, mid, scal), data=Dat, start=list(Asym=1,mid=1,scal=1), lower=list(Asym=1,mid=1,scal=1)); Dat.nlrob thanks, Shane I'm not sure what your example is supposed to illustrate, but the 'lower' argument in nls() is being ignored. As ?nls says: 'Bounds can only be used with the port algorithm', which is not the default, and nls() does issue a warning with your code.
If you want to force a coefficient to be positive, the usual approach is to estimate the logarithm of the coefficient by using the exp(log(coef)) construct. See argument 'lrc' in ?SSasymp for example. Introducing a shift to accommodate coef > k for given k is simple. Peter Ehlers
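To illustrate the ?nls remark quoted above: plain nls() does honour bounds, but only with algorithm = "port". A minimal sketch of a bounded fit on data like Shane's (the start values here are my own choice, not from the thread, and the logistic is written out explicitly rather than via the SSlogis selfStart model):

```r
set.seed(1)
x <- rep(1:25, 20)
y <- SSlogis(x, 10, 12, 2) * rnorm(500, 1, 0.1)
fit <- nls(y ~ Asym / (1 + exp((mid - x) / scal)),
           data = data.frame(x, y),
           start = list(Asym = 8, mid = 10, scal = 3),
           algorithm = "port",                    # required for bounds
           lower = c(Asym = 1, mid = 1, scal = 1))
coef(fit)  # all three estimates respect the lower bound of 1
```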
Re: [R] nlrob and robust nonlinear regression with upper and/or lower bounds on parameters
Forgot to mention: You might find the nlmrt package helpful, but I have no experience with that (yet). Peter Ehlers
[R] dispersion indicator for clustered data
Hi, I have a dataset with clustered data (observations within groups) and would like to make some descriptive plots. Now, I am a little bit lost on how to present the dispersion of the data (what kind of residuals to plot). I could compute the standard error of the mean (SEM) ignoring the clustering (very low values and misleading), or I could first aggregate the data by calculating the mean for each group and calculate the SEM of these means. But I am not so sure what implications these two approaches have. In the end, I take the clustering into account by fitting a random-intercept regression model; however, for plotting I would like to have a descriptive dispersion indicator of the data. Now, I heard a lot about 'clustered' or 'robust' standard errors. Is there some kind of correction I can apply to the simple SEM formula (sd(x)/sqrt(m)) to take care of correlated observations within clusters? Or are there bootstrap or jackknife approaches implemented in R (or a CRAN package) which give me unbiased variance estimation for clustered data? thanks for any suggestions!
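One concrete option for the bootstrap route asked about above is a cluster (group-level) bootstrap: resample whole groups with replacement and take the standard deviation of the resampled means. A sketch on simulated data (the data set and its structure are invented for illustration):

```r
set.seed(42)
# hypothetical clustered data: 10 groups of 20 observations,
# with a group-level random effect so observations within a group correlate
dat <- data.frame(g = rep(1:10, each = 20),
                  y = rep(rnorm(10), each = 20) + rnorm(200, sd = 0.5))
groups <- unique(dat$g)
B <- 2000
boot_means <- replicate(B, {
  gs <- sample(groups, replace = TRUE)            # resample clusters, not rows
  mean(unlist(lapply(gs, function(k) dat$y[dat$g == k])))
})
naive_sem  <- sd(dat$y) / sqrt(nrow(dat))         # ignores the clustering
cluster_se <- sd(boot_means)
c(naive = naive_sem, cluster = cluster_se)        # cluster SE is much larger here
```

The gap between the two numbers is exactly the "very low values and misleading" effect described in the question.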
Re: [R] Looking for a good tutorial on ff package
The documentation is very straightforward; I suggest you describe what you want to do in more detail, and what you don't understand about the functions when you try to use them. You basically create an array with ?ff or a data.frame with ?ffdf and proceed from there; each page has examples. All I've ever done is make big objects and populate them in chunks, which is really natural and easy for an array. I did have specific memory caching issues on Windows when the object exceeds available memory, but still that was straightforward to describe and ask about specifically. I've seen a lot of discussion of bigmemory being easier to understand, but I find it far more limiting. Do a search on 'ff tutorial' and explore the packages that rely on ff after working through the basic help pages in the package itself. The CRAN Task View 'High Performance Computing' has an overview of the topic for a suite of packages (though it does not mention the tools in raster/rgdal, which are very good if you happen to use spatial grid data). Cheers, Mike. On Fri, Mar 15, 2013 at 11:18 PM, Fritz Zuhl r_listse...@zuhl.org wrote: Hi, I am looking for a good tutorial on the ff package. Any suggestions? Also, any other package anyone would recommend for dealing with data that extends beyond the RAM would be greatly appreciated. Thanks, Fritz Zuhl -- Michael Sumner Hobart, Australia e-mail: mdsum...@gmail.com
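To make Mike's "populate them in chunks" point concrete, here is a hedged sketch with ff (the chunk size is arbitrary, and this assumes the CRAN ff package is installed):

```r
library(ff)
# an on-disk double vector; only the accessed chunk is held in RAM
x <- ff(vmode = "double", length = 1e6)
chunk <- 1e5
for (i in seq(1, length(x), by = chunk)) {
  idx <- i:min(i + chunk - 1, length(x))
  x[idx] <- sqrt(idx)          # write one chunk at a time
}
x[c(1, 4, 9)]  # 1 2 3
```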
[R] identifying and drawing from T distribution
dear R experts: fitdistr suggests that a t with a mean of 1, an sd of 2, and 2.6 degrees of freedom is a good fit for my data. Now I want to draw random samples from this distribution. Should I draw from a uniform distribution and use the distribution function itself for the transform, or is there a better way to do this? There is a non-centrality parameter ncp in rt, but one parameter ncp cannot subsume two (m and s), of course. My first attempt was to draw rt(..., df=2.63)*s+m, but this was obviously not it. Advice appreciated. /iaw Ivo Welch (ivo.we...@gmail.com) http://www.ivo-welch.info/
Re: [R] identifying and drawing from T distribution
Hi Ivo, Try something like this: rt(1e5, df = 2.6, ncp = (1 - 0) * sqrt(2.6 + 1)/2) The ncp comes from the mean, N, and SD. See ?rt Cheers, Josh On Fri, Mar 15, 2013 at 6:58 PM, ivo welch ivo.we...@anderson.ucla.edu wrote: dear R experts: fitdistr suggests that a t with a mean of 1, an sd of 2, and 2.6 degrees of freedom is a good fit for my data. now I want to draw random samples from this distribution. should I draw from a uniform distribution and use the distribution function itself for the transform, or is there a better way to do this? there is a non-centrality parameter ncp in rt, but one parameter ncp cannot subsume two (m and s), of course. my first attempt was to draw rt(..., df=2.63)*s+m, but this was obviously not it. advice appreciated. /iaw -- Joshua Wiley Ph.D. Student, Health Psychology University of California, Los Angeles http://joshuawiley.com/ Senior Analyst - Elkhart Group Ltd. http://elkhartgroup.com
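A different reading, offered as a hedged aside: MASS::fitdistr's "t" family fits a location-scale t (location m, scale s, df), and if the reported 1 and 2 are that location and scale (rather than the sample mean and sd), then the OP's original transform is the standard way to sample from it:

```r
set.seed(1)
m <- 1; s <- 2; df <- 2.6
draws <- m + s * rt(1e5, df = df)
mean(draws)  # close to m
# note: sd(draws) estimates s * sqrt(df/(df - 2)), not s itself,
# since the t's variance is df/(df - 2) times the squared scale
```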
Re: [R] question about nls
I decided to follow up my own suggestion and look at the robustness of nls vs. nlxb. NOTE: this problem is NOT one that nls() would usually be applied to. The script below is very crude, but does illustrate that nls() is unable to find a solution in 70% of tries where nlxb (a Marquardt approach) succeeds. I make no claim for elegance of this code -- very quick and dirty. JN
debug <- FALSE
library(nlmrt)
x <- c(60, 80, 100, 120)
y <- c(0.8, 6.5, 20.5, 45.9)
mydata <- data.frame(x, y)
mydata
xmin <- c(0, 0, 0)
xmax <- c(8, 8, 8)
set.seed(123456)
nrep <- as.numeric(readline("Number of reps:"))
pnames <- c("a", "b", "d")
npar <- length(pnames)
# set up structure to record results
# need start, failnls, parnls, ssnls, failnlxb, parnlxb, ssnlxb
tmp <- matrix(NA, nrow = nrep, ncol = 3 * npar + 4)
outcome <- as.data.frame(tmp)
rm(tmp)
colnames(outcome) <- c(paste("st-", pnames, sep = ""), "failnls",
    paste("nls-", pnames, sep = ""), "ssnls", "failnlxb",
    paste("nlxb-", pnames, sep = ""), "ssnlxb")
for (i in 1:nrep) {
  cat("Try ", i, ":\n")
  st <- runif(3)
  names(st) <- pnames
  if (debug) print(st)
  rnls <- try(nls(y ~ exp(a + b * x) + d, start = st, data = mydata), silent = TRUE)
  if (class(rnls) == "try-error") {
    failnls <- TRUE
    parnls <- rep(NA, length(st))
    ssnls <- NA
  } else {
    failnls <- FALSE
    ssnls <- deviance(rnls)
    parnls <- coef(rnls)
  }
  names(parnls) <- pnames
  if (debug) { cat("nls():"); print(rnls) }
  rnlxb <- try(nlxb(y ~ exp(a + b * x) + d, start = st, data = mydata), silent = TRUE)
  if (class(rnlxb) == "try-error") {
    failnlxb <- TRUE
    parnlxb <- rep(NA, length(st))
    ssnlxb <- NA
  } else {
    failnlxb <- FALSE
    ssnlxb <- rnlxb$ssquares
    parnlxb <- rnlxb$coeffs
  }
  names(parnlxb) <- pnames
  if (debug) { cat("nlxb():"); print(rnlxb); tmp <- readline(); cat("\n") }
  solrow <- c(st, failnls = failnls, parnls, ssnls = ssnls,
              failnlxb = failnlxb, parnlxb, ssnlxb = ssnlxb)
  outcome[i, ] <- solrow
} # end loop
cat("Proportion of nls runs that failed = ", sum(outcome$failnls) / nrep, "\n")
cat("Proportion of nlxb runs that failed = ", sum(outcome$failnlxb) / nrep, "\n")
Re: [R] new question
Hi,

Try this:

directory <- "/home/arunksa111/dados"

# modified the function
GetFileList <- function(directory, number) {
  setwd(directory)
  filelist1 <- dir()
  lista <- dir(directory, pattern = paste("MSMS_", number, "PepInfo.txt", sep = ""),
               full.names = TRUE, recursive = TRUE)
  output <- list(filelist1, lista)
  return(output)
}

file.list.names <- GetFileList(directory, 23)[[1]]
lista <- GetFileList(directory, 23)[[2]]
FacGroup <- c(0, 1, 0, 2, 2, 0, 3)

ReadDir <- function(FacGroup) {
  list.new <- lista[FacGroup != 0]
  read.list <- lapply(list.new, function(x) read.table(x, header = TRUE, sep = "\t"))
  names(read.list) <- file.list.names[FacGroup != 0]
  return(read.list)
}

ListFacGroup <- ReadDir(FacGroup)

z.boxplot <- function(lst) {
  new.list <- lapply(lst, function(x) x[x$FDR < 0.01, ])
  pdf("VeraBP.pdf")
  lapply(names(new.list), function(x)
    lapply(new.list[x], function(y)
      boxplot(FDR ~ z, data = y, xlab = "Charge", ylab = "FDR", main = x)))
  dev.off()
}

z.boxplot(ListFacGroup)

A.K.

From: Vera Costa veracosta...@gmail.com
To: arun smartpink...@yahoo.com
Sent: Friday, March 15, 2013 2:08 PM
Subject: Re: new question

Sorry, could you give me a little more help? Using the same data, I need a boxplot by groups. I have written out below the functions I'm using. The last one (z.boxplot) is what I need; the others are OK. Thank you one more time.
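One small efficiency note on the snippet above: GetFileList(directory, 23) is called twice, once per element of its result, so the directory tree is scanned twice. Calling it once and indexing the saved result avoids that. A sketch, with a hypothetical stand-in for GetFileList() so the snippet is self-contained:

```r
# Hypothetical stand-in: any function that returns a two-element list,
# like GetFileList() in the thread (names list, then full paths)
GetFileList <- function(directory, number) {
  list(c("a.txt", "b.txt"), c("/tmp/a.txt", "/tmp/b.txt"))
}

res <- GetFileList("some/dir", 23)   # call once...
file.list.names <- res[[1]]          # ...then index the saved result
lista <- res[[2]]
```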
GetFileList <- function(directory, number) {
  setwd(directory)
  filelist1 <- dir()[file.info(dir())$isdir]
  direct <- dir(directory, pattern = paste("MSMS_", number, "PepInfo.txt", sep = ""),
                full.names = FALSE, recursive = TRUE)
  direct <- lapply(direct, function(x) paste(directory, "/", x, sep = ""))
  lista <- unlist(direct)
  output <- list(filelist1, lista)
  return(output)
}

ReadDir <- function(FacGroup) {
  list.new <- lista[FacGroup != 0]
  read.list <- lapply(list.new, function(x) read.table(x, header = TRUE, sep = "\t"))
  names(read.list) <- file.list.names[FacGroup != 0]
  return(read.list)
}

directory <- "C:/Users/Vera Costa/Desktop/dados.lixo"
file.list.names <- GetFileList(directory, 23)[[1]]
lista <- GetFileList(directory, 23)[[2]]
FacGroup <- c(0, 1, 0, 2, 2, 0, 3)
ListFacGroup <- ReadDir(FacGroup)
# zPValues(ListFacGroup, FacGroup)

z.boxplot <- function(lista) {
  # I need to eliminate all data with FDR < 0.01
  new.list <- lista[FDR < 0.01]
  # boxplots split by groups
  boxplot(FDR ~ z, data = dct1, xlab = "Charge", ylab = "FDR", main = paste("t", i))
}
z.boxplot(ListFacGroup)

2013/3/13 Vera Costa veracosta...@gmail.com

No problem! Sorry for my questions.

2013/3/13 arun smartpink...@yahoo.com

As I mentioned earlier, I don't find it useful to do an ANOVA on that kind of data. Previously, I tried with chisq.test as well; it gave warnings(), and you then responded that it was not correct. I would suggest you dput() an example dataset of the specific columns that you want to compare (possibly by row) and post it to the R-help list. If you get any reply, you can then apply it to your whole list of files. Sorry, I am busy today.

From: Vera Costa veracosta...@gmail.com
To: arun smartpink...@yahoo.com
Sent: Wednesday, March 13, 2013 9:43 AM
Subject: Re: new question

Ok. Thank you. Could you help me to apply this?

2013/3/13 arun smartpink...@yahoo.com

You are comparing one datapoint to another. It doesn't make sense. For an ANOVA, you need replications to calculate the df. Maybe you could try chisq.test.
From: Vera Costa veracosta...@gmail.com
To: arun smartpink...@yahoo.com
Sent: Wednesday, March 13, 2013 8:56 AM
Subject: Re: new question

I agree with you. I wrote these tests because I need to compare with some test. I agree it is not very correct, but what is Bioconductor? I need to eliminate some data (rows) that are not very significant, based on some statistics. What about your idea? How can I do this?

2013/3/13 arun smartpink...@yahoo.com

"Ok. I need a t test (it's in this function). But I need a corrected chisq.test and an ANOVA with the data in the attachment." What do you mean by this? Though I calculated the t test by comparing a single value against another for each row, I don't think it makes sense statistically. Here you are estimating the mean from just one value, which then is the mean, and comparing it with another single value. It doesn't make much sense. I think there are some Bioconductor packages which do this kind of comparison (I don't remember the names). Also, I am not sure what kind of inference you want from the chi-square test, or from the ANOVA (using just 2 datapoints?), if the comparison is row-wise.

From: Vera Costa veracosta...@gmail.com
To: arun smartpink...@yahoo.com
Sent: Tuesday, March 12, 2013 6:04 PM
Subject: Re: new question

Ok. It isn't the last code... You sent me this code:

directory <- "/home/arunksa111/data.new"
# first function
filelist <- function(directory, number, list1) {
  setwd(directory)
  filelist1 <- dir(directory)
  direct <- dir(directory, pattern = paste("MSMS_", number, "PepInfo.txt", sep = ""),
                full.names = FALSE, recursive = TRUE)
Re: [R] missing values in an array
Thank you very much. Arun's reply is exactly what I need. Thank you once again!

ray

On Sat, Mar 16, 2013 at 12:31 AM, Berend Hasselman b...@xs4all.nl wrote:

On 15-03-2013, at 17:08, Ray Cheung ray1...@gmail.com wrote:

Dear All, I have an array with some missing values (NA) in it. I want to remove a particular matrix if a missing value is detected in it. How can I do so? Thank you very much.

It is not clear what the dimension of your array is. If your array/matrix is two-dimensional, then

any(is.na(A))  # A is the name of the array/matrix

will return TRUE if at least one element of A is NA, and then you can delete A. If your array has three dimensions, you'll have to look at arun's solution.

Berend
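Arun's solution is not quoted in this thread; for the archive, a minimal sketch of one common way to drop any slice of a three-dimensional array that contains an NA (array name A and the toy dimensions are assumptions for illustration):

```r
# Toy 3-D array: two 2x2 matrices stacked along the third dimension
A <- array(1:8, dim = c(2, 2, 2))
A[1, 1, 2] <- NA  # plant a missing value in the second slice

# TRUE for each slice (third-dimension index) that contains at least one NA
bad <- apply(A, 3, function(m) any(is.na(m)))

# Keep only the complete slices; drop = FALSE preserves the 3-D shape
A.clean <- A[, , !bad, drop = FALSE]
dim(A.clean)  # 2 2 1
```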