Re: [R] Logical vectors
On Wed, 3 Nov 2010, Stephen Liu wrote:

[snip]

> 2)
> > x
> [1] 1 2 3 4 5
> > temp <- x > 1
> > temp
> [1] FALSE TRUE TRUE TRUE TRUE
>
> Why NOT
> > temp
> [1] TRUE FALSE FALSE FALSE FALSE
> ?

Maybe because of the definition of ">" (greater (!) than)? Or do you expect 1 to be greater than 1 and not greater than 2, 3, 4, and 5?

Regards -- Gerrit

- AOR Dr. Gerrit Eichner
Mathematical Institute, Room 212
gerrit.eich...@math.uni-giessen.de
Justus-Liebig-University Giessen
Tel: +49-(0)641-99-32104
Arndtstr. 2, 35392 Giessen, Germany
Fax: +49-(0)641-99-32109
http://www.uni-giessen.de/cms/eichner

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] Logical vectors
On Wed, Nov 3, 2010 at 10:50 PM, Stephen Liu sati...@yahoo.com wrote:
> Hi folks,
>
> Pls help me to understand follow;
>
> An Introduction to R
> 2.4 Logical vectors
> http://cran.r-project.org/doc/manuals/R-intro.html#R-and-statistics
>
> 1)
> > x
> [1] 1 2 3 4 5

A vector, x, is defined with 5 elements, {1, 2, 3, 4, 5}.

> > temp <- x != 1

Perform the logical test that x does not equal 1, returning either TRUE or FALSE for each element: 1 = 1 so FALSE, 2 != 1 so TRUE, etc. Next we assign *the results* of the logical test to the vector 'temp'.

> > temp
> [1] FALSE TRUE TRUE TRUE TRUE

Print the vector to screen.

> 2)
> > x
> [1] 1 2 3 4 5

Note that x has not changed here; we assigned to temp, not to x.

> > temp <- x > 1

Now we assign the results of the logical test x > 1 {1 = 1 so FALSE, 2 > 1 so TRUE, 3 > 1 so TRUE, 4 > 1 so TRUE, 5 > 1 so TRUE}. We assign these results to a vector, 'temp'. This *new* assignment overwrites the old vector 'temp'.

> > temp
> [1] FALSE TRUE TRUE TRUE TRUE

Print temp to screen; this is the result of our second logical test (x > 1).

> Why NOT
> > temp
> [1] TRUE FALSE FALSE FALSE FALSE

My best guess of where you got confused is that we assigned the results to 'temp', so 'x' remained unchanged {1, 2, 3, 4, 5}, or that you confused '<-', which is the assignment operator in R, with "less than negative"... *OR* less than or equal.

We could write this equivalently:

> 1:5 > 1
[1] FALSE TRUE TRUE TRUE TRUE

This was the logical test, whose results were assigned to the vector, temp.

> assign(x = "temp", value = 1:5 > 1)

Using the assign function (not often recommended) to avoid any confusion with the assignment operator, <-.

> temp
[1] FALSE TRUE TRUE TRUE TRUE

Print to screen.

HTH,

Josh

> ?
>
> TIA
>
> B.R.
> Stephen L

--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/
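
The point made in this reply can be checked at the R prompt. The following is only a minimal sketch: the names x and temp come from the thread, while eq is a name introduced here for illustration. It shows that each comparison is applied element by element, that == is the operator for exact equality, and that assigning to temp leaves x untouched.

```r
x <- 1:5

# '!=' tests every element of x for inequality with 1
temp <- x != 1
print(temp)   # FALSE TRUE TRUE TRUE TRUE

# '>' tests every element for being strictly greater than 1;
# for this particular x the result happens to equal that of x != 1
temp <- x > 1
print(temp)   # FALSE TRUE TRUE TRUE TRUE

# '==' is the operator for exact equality ('eq' is an illustrative name)
eq <- x == 1
print(eq)     # TRUE FALSE FALSE FALSE FALSE

# x itself was never modified by the assignments to temp
stopifnot(identical(x, 1:5))
```

Because 1 is the smallest element of x, the tests x != 1 and x > 1 give the same answer here; with x <- 0:4 they would differ in the first element.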
[R] problem with RODBC installation
Good morning,

I have some problems installing RODBC to R in a linux cluster.

My R version is:
R version 2.12.0 (2010-10-15)
Platform: x86_64-unknown-linux-gnu (64-bit)

I get the following error:

> install.packages('RODBC')
Installing package(s) into '/home/jorgehou/R/x86_64-unknown-linux-gnu-library/2.12'
(as 'lib' is unspecified)
trying URL 'http://stat.ethz.ch/CRAN/src/contrib/RODBC_1.3-2.tar.gz'
Content type 'application/x-gzip' length 1108358 bytes (1.1 Mb)
opened URL
==
downloaded 1.1 Mb

* installing *source* package 'RODBC' ...
checking for gcc... gcc -std=gnu99
checking for C compiler default output file name... a.out
checking whether the C compiler works... yes
checking whether we are cross compiling... no
checking for suffix of executables...
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether gcc -std=gnu99 accepts -g... yes
checking for gcc -std=gnu99 option to accept ANSI C... none needed
checking how to run the C preprocessor... gcc -std=gnu99 -E
checking for egrep... grep -E
checking for ANSI C header files... yes
checking for sys/types.h... yes
checking for sys/stat.h... yes
checking for stdlib.h... yes
checking for string.h... yes
checking for memory.h... yes
checking for strings.h... yes
checking for inttypes.h... yes
checking for stdint.h... yes
checking for unistd.h... yes
checking sql.h usability... no
checking sql.h presence... no
checking for sql.h... no
checking sqlext.h usability... no
checking sqlext.h presence... no
checking for sqlext.h... no
configure: error: ODBC headers sql.h and sqlext.h not found
ERROR: configuration failed for package 'RODBC'
* removing '/home/jorgehou/R/x86_64-unknown-linux-gnu-library/2.12/RODBC'

The downloaded packages are in
'/tmp/Rtmpgb1Nxz/downloaded_packages'
Warning message:
In install.packages("RODBC") :
  installation of package 'RODBC' had non-zero exit status

I found some info on it here:
http://r.789695.n4.nabble.com/Problem-installing-RODBC-td2016736.html
but how should I use it???

(Yes, I am very novice to Linux (and R) so it might be a stupid question)

Thanks!
Jørgen

--
Jørgen Blystad Houge
MSc Student Industrial Economics
NTNU, Norway
Re: [R] problem with RODBC installation
Please read the RODBC manual (which comes with it). You (or the cluster owner) need to install unixODBC, and if installing from RPMs etc, something like unixODBC-devel.

Please also note the R posting guide - no HTML mail, use an appropriate list (R-sig-db or R-devel here as this is about non-R programming).

On Thu, 4 Nov 2010, Jørgen Blystad Houge wrote:

[snip]

--
Brian D. Ripley, rip...@stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
Re: [R] problem with RODBC installation
On 11/04/2010 08:14 AM, Jørgen Blystad Houge wrote:
> ...
> '/tmp/Rtmpgb1Nxz/downloaded_packages'
> Warning message:
> In install.packages("RODBC") :
>   installation of package 'RODBC' had non-zero exit status
>
> I found some info on it here:
> http://r.789695.n4.nabble.com/Problem-installing-RODBC-td2016736.html
> but how should I use it???
>
> (Yes, I am very novice to Linux (and R) so it might be a stupid question)

Well, the answer is in the output:

configure: error: ODBC headers sql.h and sqlext.h not found

That's an installation issue with ODBC, not an R issue as such. Usually a development package is missing, but exactly which one depends on your particular flavour of Linux. In Fedora 13, it is here:

$ rpm -qf /usr/include/sqlext.h
unixODBC-devel-2.2.14-12.fc13.i686

so the unixODBC-devel package is required. In e.g. Ubuntu, it is -er- somewhere else...

--
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Phone: (+45)38153501
Email: pd@cbs.dk Priv: pda...@gmail.com
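
Before retrying the installation, the missing-header diagnosis can be checked from within R. The sketch below is only an illustration: odbc_headers_present is a hypothetical helper written for this note (it is not part of RODBC), and the two default directories are an assumption about where a distribution might place the headers; they may well live elsewhere on a given system.

```r
# Hypothetical helper: TRUE if every header is found in at least one
# of the directories in 'dirs' (the default directories are a guess)
odbc_headers_present <- function(dirs = c("/usr/include", "/usr/local/include"),
                                 headers = c("sql.h", "sqlext.h")) {
  found <- vapply(headers,
                  function(h) any(file.exists(file.path(dirs, h))),
                  logical(1))
  all(found)
}

if (odbc_headers_present()) {
  message("sql.h and sqlext.h found in the directories searched")
} else {
  message("sql.h/sqlext.h not found in the directories searched; ",
          "the unixODBC development package may be missing")
}
```

If the headers are reported as missing, installing the development package named in the replies above (or its equivalent on the distribution in use) is the step to try before calling install.packages("RODBC") again.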
[R] How to do bootstrap for the complex sample design?
Hello;

Our survey is structured as follows: the area to be investigated is divided into 6 regions; within each region, one urban community and one rural community are randomly selected; then samples are randomly drawn from each selected urban and rural community.

The problem is that in each urban/rural stratum we only have one sample. In this case, how do we do the bootstrap?

Any comments or hints are greatly appreciated!

Faye
Re: [R] Logical vectors
Hi Gerrit,

Thanks for your advice.

In;
2.4 Logical vectors
http://cran.r-project.org/doc/manuals/R-intro.html#R-and-statistics

It states:-
The logical operators are <, <=, >, >=, == for exact equality and != for inequality

# exact equality
!= # inequality

I did as follows;

> x <- 1:5
> x
[1] 1 2 3 4 5
> temp <- x != 1
> temp
[1] FALSE TRUE TRUE TRUE TRUE

That is correct.

> rm(temp)
> temp <- x > 1
> temp
[1] FALSE TRUE TRUE TRUE TRUE

That seems not correct. My understanding is;
[1] TRUE FALSE FALSE FALSE FALSE

B.R.
Stephen L

- Original Message
From: Gerrit Eichner gerrit.eich...@math.uni-giessen.de
To: Stephen Liu sati...@yahoo.com
Cc: r-help@r-project.org
Sent: Thu, November 4, 2010 2:34:55 PM
Subject: Re: [R] Logical vectors

[snip]
Re: [R] Logical vectors
Hi Joshua,

Thanks for your advice.

> assign(x = "temp", value = 1:5 > 1)
>
> using the assign function (not often recommended) to avoid any confusion with the assignment operator, <-.
>
> temp
> [1] FALSE TRUE TRUE TRUE TRUE

I got it. Thanks

B.R.
Stephen L

- Original Message
From: Joshua Wiley jwiley.ps...@gmail.com
To: Stephen Liu sati...@yahoo.com
Cc: r-help@r-project.org
Sent: Thu, November 4, 2010 2:46:15 PM
Subject: Re: [R] Logical vectors

[snip]
Re: [R] Best Fit line trouble with rsruby
On Wed, 3 Nov 2010 18:24:43 -0700 (PDT), Deadpool deadpoo...@comcast.net wrote:
> Hello, I am using R, through rsruby, to create a graph and best fit line for a set of data points, regarding data collected in a Chemistry class. The problem is that although the graph functions perfectly properly, the best fit line will not work.
>
> I initially used code I pretty much copied from a website with a tutorial on this, which was:
>
> graphData.png("/code/Beer's-Law Graph.png")
> concentration = p1Conc
> absorbance = p1AbsorbanceArray
> graphData.assign('x', p1Conc)
> graphData.assign('y', p1AbsorbanceArray)
> fit = graphData.lm('x ~ y')
> graphData.plot(concentration, absorbance)
> graphData.abline(fit["coefficients"]["(Intercept)"], fit["coefficients"]["y"])
> puts fit["coefficients"]
> graphData.eval_R("dev.off()")
>
> (p1Conc and p1AbsorbanceArray are arrays)
>
> This worked for the graph, but the best fit line looked (and the infinitesimally small slope supported) like it was based off a single point. The site said they had to define something in the R interpreter first, but didn't elaborate, so I gave it a go, and obviously it didn't work.

It looks to me like you have the response and explanatory variables swapped in your model (or your plot). Try:

fit = graphData.lm("y~x")
graphData.plot(concentration, absorbance)
graphData.abline(fit["coefficients"]["(Intercept)"], fit["coefficients"]["x"])

Or just swap the axes on your plot.

> I then tried something like this, as I thought the conversion from the array to the string in the assign function was causing the problem with the best fit line.

No - that should be fine. You aren't converting an array into a string, just assigning a Ruby Array to an R variable (vector) with the given name.

> graphData = RSRuby.instance
> graphData.png("/code/Beer's-Law Graph.png")
> concentration = graphData.c(p1Conc[0..(p1SampNum - 1)])
> absorbance = graphData.c(p1AbsorbanceArray[0..(p1SampNum - 1)])
> fit = graphData.lm(concentration ~ absorbance)
> graphData.plot(concentration, absorbance)
> graphData.abline(fit["coefficients"]["(Intercept)"], fit["coefficients"]["absorbance"])
> puts fit["coefficients"]
> print "\n"
> graphData.eval_R("dev.off()")
>
> Basically trying to bypass that, and feed the numbers straight from the array into the best fit line, but the program was giving me an error, saying it didn't know what ~ was for an array (should note I tried it first without doing the graphData.c thing, but that didn't work and as the .c function didn't seem to store things as an array, I thought that might work, it didn't, as it does store data as an array).

RSRuby doesn't know about R formulas, so a bare '~' is a syntax error in Ruby. You must pass the model specification as a string as you did the first time. Unfortunately this means you either have to do the .assign() workaround to get the data into variables R can see or pass the data via the 'data' argument to lm.
See this irb session for an example of the second technique:

wsp00614206:~ GUTTEA$ irb
>> require 'rsruby'
=> true
>> r = RSRuby.instance
=> #<RSRuby:0x101176c20 @class_table={}, @default_mode=-1, @caching=true, @cache={"get"=>#<RObj:0x101176798>, "helpfun"=>#<RObj:0x101172cd8>, "help"=>#<RObj:0x101172cd8>, "NaN"=>NaN, "FALSE"=>false, "TRUE"=>true, "F"=>false, "NA"=>-2147483648, "eval"=>#<RObj:0x101175230>, "T"=>true, "parse"=>#<RObj:0x1011757a8>}, @proc_table={}>
>> x = (1..10).to_a
=> [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>> y = (11..20).to_a
=> [11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
>> fit = r.lm("y~x", :data => {'x' => x, 'y' => y})
=> {"model"=>{"x"=>[1, 2, 3, 4, 5, 6, 7, 8, 9, 10], "y"=>[11, 12, 13, 14, 15, 16, 17, 18, 19, 20]}, "qr"=>{"qr"=>[[-3.16227766016838, -17.3925271309261], [0.316227766016838, 9.08295106229247], [0.316227766016838, 0.15621147358221], [0.316227766016838, 0.0461150970695743], [0.316227766016838, -0.0639812794430617], [0.316227766016838, -0.174077655955698], [0.316227766016838, -0.284174032468334], [0.316227766016838, -0.39427040898097], [0.316227766016838, -0.504366785493606], [0.316227766016838, -0.614463162006242]], "pivot"=>[1, 2], "rank"=>2, "tol"=>1.0e-07, "qraux"=>[1.31622776601684, 1.26630785009485]}, "assign"=>[0, 1], "rank"=>2, "residuals"=>{"6"=>3.24300739408472e-16, "7"=>3.20784180753863e-16, "8"=>-3.4886619267584e-16, "9"=>-1.01851656610554e-15, "1"=>-3.63520003547369e-15, "2"=>1.72099416959944e-15, "3"=>1.22302883507243e-15, "10"=>-3.55899309985059e-16, "4"=>9.97467671492785e-16, "5"=>7.71906507913144e-16}, "df.residual"=>8, "effects"=>{""=>1.77635683940025e-15, "x"=>9.08295106229247, "(Intercept)"=>-49.0153037326099}, "xlevels"=>{}, "fitted.values"=>{"6"=>16.0, "7"=>17.0, "8"=>18.0, "9"=>19.0, "1"=>11.0, "2"=>12.0, "3"=>13.0, "10"=>20.0, "4"=>14.0, "5"=>15.0}, "call"=>#<RObj:0x10111c810>, "terms"=>#<RObj:0x10111c798>, "coefficients"=>{"x"=>1.0, "(Intercept)"=>10.0}}
>> fit["coefficients"]
=> {"x"=>1.0, "(Intercept)"=>10.0}

> So basically I'm stuck. Not sure if anyone has any experience with rsruby, but any help would be appreciated. I'm pretty sure the fit = graphData.lm(etcetera) line is where the trouble is, but not sure how to handle it.
You got pretty close! -- Alex Gutteridge
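
The effect of swapping the response and the explanatory variable can be seen in plain R, independently of rsruby. This is only a sketch under stated assumptions: the concentration and absorbance values below are invented for illustration (they are not the poster's data), and the output file name is arbitrary.

```r
# Invented, perfectly linear data: absorbance = 0.05 + 2 * concentration
concentration <- c(0.1, 0.2, 0.3, 0.4, 0.5)
absorbance    <- 0.05 + 2 * concentration

# Response on the left of '~': this matches plot(concentration, absorbance)
fit_yx <- lm(absorbance ~ concentration)

# Swapped roles: the coefficients describe concentration as a function of
# absorbance, so drawing them on the plot above gives the wrong line
fit_xy <- lm(concentration ~ absorbance)

print(coef(fit_yx))  # intercept about 0.05, slope about 2
print(coef(fit_xy))  # intercept about -0.025, slope about 0.5

# abline() accepts the fitted model directly, so the coefficients do not
# have to be extracted by hand; the plot goes to a temporary PDF here
pdf(file.path(tempdir(), "beers-law-sketch.pdf"))
plot(concentration, absorbance)
abline(fit_yx)
invisible(dev.off())
```

With the axes as plotted in the thread (concentration on x, absorbance on y), only the first fit gives a line that passes through the points.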
Re: [R] Logical vectors
On Thu, 4 Nov 2010, Stephen Liu wrote:

[snip]

> In;
> 2.4 Logical vectors
> http://cran.r-project.org/doc/manuals/R-intro.html#R-and-statistics
>
> It states:-
> The logical operators are <, <=, >, >=, == for exact equality and != for inequality
>
> # exact equality
> != # inequality

[snip]

Hello, Stephen,

in my understanding of the sentence

"The logical operators are <, <=, >, >=, == for exact equality and != for inequality"

the phrase "exact equality" refers to the operator ==, i.e. to the last element == in the enumeration (<, <=, >, >=, ==), and not to its first.

Regards -- Gerrit

- AOR Dr. Gerrit Eichner
Mathematical Institute, Room 212
gerrit.eich...@math.uni-giessen.de
Justus-Liebig-University Giessen
Tel: +49-(0)641-99-32104
Arndtstr. 2, 35392 Giessen, Germany
Fax: +49-(0)641-99-32109
http://www.uni-giessen.de/cms/eichner
[R] Reading in irregular, daily time series data
Hello,

I am trying to read in some time series data and am having trouble. Forgive me, I am quite new to R. The data is in the form of:

TS.1
2000-07-28 1419.89
2000-07-31 1430.83
2000-08-01 1438.1
2000-08-02 1438.7
2000-08-03 1452.56
2000-08-04 1462.93
2000-08-07 1479.32
2000-08-08 1482.8
2000-08-09 1472.87
2000-08-10 1460.25
2000-08-11 1471.84
2000-08-14 1491.56
2000-08-15 1484.43
...

The data is daily data, but it is irregular (i.e. there are some missing data points). I have tried reading in the data like so, but when I plot, the time series does not preserve the dates.

sp500 <- ts(read.table("SP500.txt", header=TRUE))
plot.ts(sp500)

I have searched the forums, and have found the following method to work:

sp500 <- read.table("SP500.txt", header=TRUE)
date <- sp500[,1]
data <- sp500[,2]
z <- aggregate(zoo(data), as.Date(date), tail, 1)
merge(z, zoo(, as.Date(unclass(time(as.ts(z))), fill=0)))
plot.zoo(z)

However, now I cannot analyze the acf of the time series, as R complains giving an error:

Error in na.fail.default(as.ts(x)) : missing values in object

Is there any way to fix this? I would actually prefer not using the zoo class, as it breaks the acf function in R. Would the new zoo object be compatible with all the ts functions?

Thank you!

--
View this message in context: http://r.789695.n4.nabble.com/Reading-in-irregular-daily-time-series-data-tp3026688p3026688.html
Sent from the R help mailing list archive at Nabble.com.
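
For readers who want to stay in base R, the sketch below shows two ways of getting an autocorrelation function from such data without zoo. It is an illustration, not an answer posted in the thread. The 13 rows are the ones shown in the post; the object names (txt, dates, values, full) are introduced here; and reading the dates from the row names assumes the file really has a one-field header line followed by two-field data lines, in which case read.table() uses the first column as row names.

```r
# The rows shown in the post, supplied inline instead of via SP500.txt
txt <- "TS.1
2000-07-28 1419.89
2000-07-31 1430.83
2000-08-01 1438.1
2000-08-02 1438.7
2000-08-03 1452.56
2000-08-04 1462.93
2000-08-07 1479.32
2000-08-08 1482.8
2000-08-09 1472.87
2000-08-10 1460.25
2000-08-11 1471.84
2000-08-14 1491.56
2000-08-15 1484.43"

# The header has one field fewer than the data rows, so the dates
# end up as row names and the values in the column TS.1
sp500  <- read.table(text = txt, header = TRUE)
dates  <- as.Date(rownames(sp500))
values <- sp500$TS.1

# Option 1: treat the observations as consecutive trading days
acf_trading <- acf(values, plot = FALSE)

# Option 2: place the series on a full daily calendar, with NA for the
# days without data, and tell acf() to tolerate the missing values
all_days <- seq(min(dates), max(dates), by = "day")
full <- rep(NA_real_, length(all_days))
full[match(dates, all_days)] <- values
acf_calendar <- acf(full, na.action = na.pass, plot = FALSE)
```

Which option is appropriate depends on whether the gaps are weekends and holidays (option 1 is the usual choice for price series) or genuinely missing observations.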
Re: [R] Logical vectors
On 04-Nov-10 08:56:42, Gerrit Eichner wrote:
> On Thu, 4 Nov 2010, Stephen Liu wrote:
>
> [snip]
>
> Hello, Stephen,
>
> in my understanding of the sentence
>
> "The logical operators are <, <=, >, >=, == for exact equality and != for inequality"
>
> the phrase "exact equality" refers to the operator ==, i.e. to the last element == in the enumeration (<, <=, >, >=, ==), and not to its first.
>
> Regards -- Gerrit

This indicates that the sentence can be mis-read. It should be cured by a small change in punctuation (hence I copy to R-devel):

The logical operators are <, <=, >, >=; == for exact equality; and != for inequality

Hoping this helps!
Ted.

E-Mail: (Ted Harding) ted.hard...@wlandres.net
Fax-to-email: +44 (0)870 094 0861
Date: 04-Nov-10 Time: 09:08:37
-- XFMail --
Re: [R] Logical vectors
Hi Gerrit,

> the phrase "exact equality" refers to the operator ==, i.e. to the last element == in the enumeration (<, <=, >, >=, ==), and not to its first.

> x <- 1:5
> x
[1] 1 2 3 4 5
> temp <- x == 1
> temp
[1] TRUE FALSE FALSE FALSE FALSE

I got it, thanks.

B.R.
Stephen L

- Original Message
From: Gerrit Eichner gerrit.eich...@math.uni-giessen.de
To: Stephen Liu sati...@yahoo.com
Cc: r-help@r-project.org
Sent: Thu, November 4, 2010 4:56:42 PM
Subject: Re: [R] Logical vectors

[snip]
[R] 3D Elliptic Fourier
Dear all,

Is it possible to do 3D Elliptic Fourier analysis with R? I've read Morphometrics with R (Use R). There are several functions in the book, like efourier, iefourier, and NEF, to deal with 2D closed outlines. Does anybody have any idea about how to deal with 3D outlines in R? Are there any known packages, functions, or publications about this?

Thanks a lot!

Yingqi

Yingqi ZHANG
Beijing P.O. Box 643, China 100044
Institute of Vertebrate Paleontology and Paleoanthropology (IVPP)
Chinese Academy of Sciences
Tel: +86-10-88369378
Fax: +86-10-68337001
Email: arvico...@gmail.com
[R] Suppressing (or changing) the colours when using spatstat plot.quadratcounts
Hi,

Using the quadrats and quadratcount functions in spatstat, when I go to plot either of these, I get the quadrats coloured by their identity, i.e., using a color ramp applied to the sequence of quadrats. This only happens when the quadrats are applied to an owin which is polygonal, i.e., when I have an irregularly shaped study area.

There doesn't seem to be any obvious way to over-ride this behaviour that I can find. I would ideally like to be able to colour the quadrats by the number of points/events each contains, rather than by their identity.

Anyone run into this? It seems like an obvious request, maybe I just haven't spotted the necessary option, although I think maybe not - it's something to do with how plot.tess() works.

Thanks

David

--
David O'Sullivan
Associate Professor of Geography
University of Auckland | Te Whare Wananga o Tamaki Makaurau
http://www.sges.auckland.ac.nz/the_school/our_people/osullivan_david/index.shtm
Re: [R] density() function: differences with S-PLUS
Dear William,

I obtained the same x values also without the from= and to= arguments, using bw instead of width in R.

At this point I tried a two-step procedure for the y:
- in the first step I obtained the x as below,
- in the second step I used the minimum and the maximum values of the x as from= and to= arguments.

In this way I obtain, in R, y values close to the S+ ones, but not the same.

R code and S+ code and output are below.

Thanks again.

Nicola

# R CODE
exdata = iris$Sepal.Length[iris$Species == "setosa"]
density(exdata, bw = 4, n = 50, cut = 0.75)$x # SAME AS S+
density(exdata, bw = 4, n = 50, cut = 0.75)$y # COMPLETELY DIFFERENT
density(exdata, width = 4, n = 50, from = 1.3, to = 8.8, cut = 0.75)$y # CLOSE TO S+

# SPLUS CODE AND OUTPUT
exdata = iris[, 1, 1]
density(exdata, width = 4)
$x:
 [1] 1.30 1.453061 1.606122 1.759184 1.912245 2.065306
 [7] 2.218367 2.371429 2.524490 2.677551 2.830612 2.983673
[13] 3.136735 3.289796 3.442857 3.595918 3.748980 3.902041
[19] 4.055102 4.208163 4.361224 4.514286 4.667347 4.820408
[25] 4.973469 5.126531 5.279592 5.432653 5.585714 5.738776
[31] 5.891837 6.044898 6.197959 6.351020 6.504082 6.657143
[37] 6.810204 6.963265 7.116327 7.269388 7.422449 7.575510
[43] 7.728571 7.881633 8.034694 8.187755 8.340816 8.493878
[49] 8.646939 8.80

$y:
 [1] 0.0007849649 0.0013097474 0.0021225491 0.0033616520
 [5] 0.0052059615 0.0078856717 0.0116917555 0.0169685132
 [9] 0.0241073754 0.0335286785 0.0456521053 0.0608554862
[13] 0.0794235072 0.1014901241 0.1269807991 0.1555625999
[17] 0.1866111931 0.2192033788 0.2521417640 0.2840144993
[21] 0.3132881074 0.3384260582 0.3580208688 0.3709241384
[25] 0.3763578665 0.3739920600 0.3639778683 0.3469316232
[29] 0.3238721233 0.2961200278 0.2651731505 0.2325739601
[33] 0.1997853985 0.1680884651 0.1385105802 0.1117884914
[37] 0.0883644110 0.0684099972 0.0518702141 0.0385181792
[41] 0.0280126487 0.0199513951 0.0139159044 0.0095050745
[45] 0.0063575653 0.0041639082 0.0026680819 0.0016700727
[49] 0.0010169912 0.0005962089

2010/11/3 William Dunlap wdun...@tibco.com
> Did you get my reply (1:31pm PST Tuesday) to your request? It showed how you needed to use the from= and to= arguments to density to get identical x components in the output, and that the small differences in the y component were due to S+ truncating the gaussian kernel at +- 4 standard deviations from the center while R does not truncate the gaussian kernel (its output looks like it uses a Fourier transform to do the convolution).
>
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
>
> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Nicola Sturaro Sommacal (Quantide srl)
> Sent: Wednesday, November 03, 2010 3:34 AM
> To: Joshua Wiley
> Cc: r-help@r-project.org
> Subject: Re: [R] density() function: differences with S-PLUS
>
> Dear Joshua,
>
> first of all, thank you very much for your reply. I hoped that someone who's familiar with both S+ and R could reply to me, because I spent some hours looking for a solution.
>
> If someone else would like to try, this is the S-PLUS code and output, while below there is the R code. I obtain the same x values, while the y values are different for both examples.
>
> Thank you very much.
> Nicola
>
> ### S-PLUS CODE AND OUTPUT ###
>
> density(1:1000, width = 4)
> $x:
>  [1] -2.0 18.51020 39.02041 59.53061 80.04082 100.55102 121.06122
>  [8] 141.57143 162.08163 182.59184 203.10204 223.61224 244.12245 264.63265
> [15] 285.14286 305.65306 326.16327 346.67347 367.18367 387.69388 408.20408
> [22] 428.71429 449.22449 469.73469 490.24490 510.75510 531.26531 551.77551
> [29] 572.28571 592.79592 613.30612 633.81633 654.32653 674.83673 695.34694
> [36] 715.85714 736.36735 756.87755 777.38776 797.89796 818.40816 838.91837
> [43] 859.42857 879.93878 900.44898 920.95918 941.46939 961.97959 982.48980
> [50] 1003.0
>
> $y:
>  [1] 4.565970e-006 1.31e-003 9.999374e-004 1.31e-003 9.999471e-004 1.31e-003
>  [7] 9.999560e-004 1.30e-003 9.999643e-004 1.29e-003 9.999718e-004 1.28e-003
> [13] 9.999788e-004 1.26e-003 9.999852e-004 1.24e-003 9.10e-004 1.22e-003
> [19] 9.63e-004 1.19e-003 1.01e-003 1.16e-003 1.06e-003 1.13e-003
> [25] 1.10e-003 1.10e-003 1.13e-003 1.06e-003 1.16e-003 1.01e-003
> [31] 1.19e-003 9.63e-004 1.22e-003 9.10e-004 1.24e-003 9.999852e-004
> [37] 1.26e-003 9.999788e-004 1.28e-003 9.999718e-004 1.29e-003 9.999643e-004
> [43] 1.30e-003 9.999560e-004 1.31e-003 9.999471e-004 1.31e-003 9.999374e-004
> [49] 1.31e-003 4.432131e-006
>
> exdata = iris[, 1, 1]
> density(exdata, width = 4)
> $x:
> [1] 1.30 1.453061 1.606122 1.759184 1.912245 2.065306 2.218367
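
One point that may help when comparing the calls in this thread: in R's density(), bw and width are not interchangeable. The help page for density describes width as an S-compatibility argument which, when numeric, is converted to a kernel-dependent bandwidth (for the Gaussian kernel, bw = width / 4). The sketch below only illustrates that relationship on the same iris subset; the object names d_width, d_bw4 and d_bw1 are introduced here, and the sketch does not attempt to reproduce the S-PLUS numbers.

```r
exdata <- iris$Sepal.Length[iris$Species == "setosa"]

d_width <- density(exdata, width = 4, n = 50, cut = 0.75)
d_bw4   <- density(exdata, bw = 4,    n = 50, cut = 0.75)
d_bw1   <- density(exdata, bw = 1,    n = 50, cut = 0.75)

# width = 4 is turned into a Gaussian bandwidth of 1
print(d_width$bw)                               # 1

# so width = 4 and bw = 1 give the same estimate ...
print(isTRUE(all.equal(d_width$y, d_bw1$y)))    # TRUE

# ... whereas bw = 4 smooths four times as heavily. Its grid runs from
# min(exdata) - cut * bw to max(exdata) + cut * bw, i.e. 1.3 to 8.8,
# which is why its x values coincide with the S-PLUS ones
print(range(d_bw4$x))                           # 1.3 8.8
```

This is consistent with the observation in the message above that the bw = 4 call reproduces the S-PLUS x values but gives completely different y values, while the width = 4 call with explicit from= and to= comes close on y.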
[R] postForm() in RCurl and library RHTMLForms
Hi RUsers,

Suppose I want to see the data on the website

url <- "http://www.nseindia.com/content/indices/ind_histvalues.htm"

for the index SP CNX NIFTY for dates FromDate=01-11-2010, ToDate=02-11-2010, and then read the HTML table from the page using readHTMLTable().

I am using this code:

webpage <- postForm(url,
    .params = list(FromDate = "01-11-2010",
                   ToDate = "02-11-2010",
                   IndexType = "SP CNX NIFTY",
                   Indicesdata = "Get Details"),
    .opts = list(useragent = getOption("HTTPUserAgent")))

But it doesn't give me the desired result.

Also, I was trying to use the function getHTMLFormDescription from the package RHTMLForms, but there we can't use the argument .opts = list(useragent = getOption("HTTPUserAgent")), which is needed for this particular website.

Thanks and Regards
Sayan Dasgupta
Re: [R] Logical vectors
Hi Ted,

Thanks for your advice and the correction on the document concerned.

B.R.
Stephen L

----- Original Message -----
From: ted.hard...@wlandres.net
To: r-help@r-project.org
Cc: Stephen Liu sati...@yahoo.com; R-Devel r-de...@stat.math.ethz.ch
Sent: Thu, November 4, 2010 5:08:42 PM
Subject: Re: [R] Logical vectors

On 04-Nov-10 08:56:42, Gerrit Eichner wrote:
> On Thu, 4 Nov 2010, Stephen Liu wrote:
> [snip]
>> In 2.4 Logical vectors
>> http://cran.r-project.org/doc/manuals/R-intro.html#R-and-statistics
>> it states:
>> The logical operators are <, <=, >, >=, == for exact equality and != for inequality
>>
>> <   # exact equality
>> !=  # inequality
> [snip]
>
> Hello, Stephen,
> in my understanding of the sentence
>   The logical operators are <, <=, >, >=, == for exact equality and != for inequality
> the phrase "exact equality" refers to the operator ==, i.e. to the last element == in the enumeration (<, <=, >, >=, ==), and not to its first.
>
> Regards -- Gerrit

This indicates that the sentence can be misread. It should be cured by a small change in punctuation (hence I copy to R-devel):

  The logical operators are <, <=, >, >=; == for exact equality; and != for inequality

Hoping this helps!
Ted.

E-Mail: (Ted Harding) ted.hard...@wlandres.net
Fax-to-email: +44 (0)870 094 0861
Date: 04-Nov-10 Time: 09:08:37
-- XFMail --
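For readers following this thread, here is a small sketch of how the comparison operators discussed above behave on the vector from the original question. The variable names (gt, ge, eq, ne) are chosen only for illustration; the point is that each operator is applied elementwise and returns a logical vector of the same length as x.

```r
# Elementwise comparisons on the vector from the thread
x <- 1:5

gt <- x > 1    # strictly greater than 1
ge <- x >= 1   # greater than or equal to 1
eq <- x == 1   # exact equality with 1
ne <- x != 1   # inequality with 1

print(gt)
print(eq)
print(ne)
```

Note that x > 1 and x != 1 happen to give the same result for this particular x, which may be part of why the two examples in the manual look identical.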
[R] Importing triple-s (standard survey structure) files to R
Dear R mailing list

I am about to start working on a project for a market research customer. Their survey data is in the triple-s format, and I need to import this into R.

The triple-s (standard survey structure) format is an open format for the exchange of survey data. It consists of two files, both in plain text format:
- A text data file (.asc)
- A metadata file (.sss) that describes the survey and data structure

According to the triple-s website, this format is supported by a long list of survey and statistical software, including SPSS. However, a search of Google, R-Site and the R mailing list archives reveals nothing at all.

I would be interested to know if anyone has prototype code for importing triple-s. If nothing exists, I plan to write and contribute a package to do this, so a starting point would be very helpful. The current specification of triple-s supports an XML format, so the way forward is probably to re-use the XML package code and build upon this.

More information at the triple-s website: http://www.triple-s.org/oft.htm

Regards
Andrie
[R] Closing unreferenced result sets in dbi / RSQLite
Hello R-help members,

I have one problem with the database interface dbi (more specifically, I work with RSQLite). Consider the following example, which writes a test table to a temporary SQLite database and sends a query to read from it:

library(RSQLite)
df <- as.data.frame(matrix(runif(4), nrow=2, ncol=2))
drv <- dbDriver("SQLite")
con <- dbConnect(drv)
dbWriteTable(con, "df", df)
dbSendQuery(con, "select * from df")

In the last line I forgot to assign the DBIResult object returned by dbSendQuery() to a variable, which happens from time to time when I work interactively. The following attempt to correct the mistake:

res <- dbSendQuery(con, "select * from df")

fails because the orphaned result set from the preceding call is still active. Consequently, I have to close the connection to keep on working, which is especially annoying when working with a temporary database where everything is discarded on disconnection.

Is there any way to create a new reference to the pending result set or to close result sets which are not bound to a variable?

Thanks for any suggestion,
Andreas

--
Andreas Borg
Medizinische Informatik
UNIVERSITÄTSMEDIZIN der Johannes Gutenberg-Universität
Institut für Medizinische Biometrie, Epidemiologie und Informatik
Obere Zahlbacher Straße 69, 55131 Mainz
www.imbei.uni-mainz.de
Telefon +49 (0) 6131 175062
E-Mail: b...@imbei.uni-mainz.de
Re: [R] Closing unreferenced result sets in dbi / RSQLite
Hi Andreas,

Try this...

# forget to assign result set
dbSendQuery(con, "select * from df")

# retrieve the result set just created
rs <- dbListResults(con)[[1]]

Then you can do dbClearResult(rs) or whatever.

Michael

On 4 November 2010 19:56, Andreas Borg andreas.b...@unimedizin-mainz.de wrote:
> [snip]
Re: [R] Orthogonalization with different inner products
See gsorth() in the heplots package.

On 11/3/2010 1:55 PM, adet...@uw.edu wrote:
> Suppose one wanted to consider random variables X_1,...,X_n and from each subtract off the piece which is correlated with the previous variables in the list, i.e. make new variables Z_i so that Z_1 = X_1 and
>
>   Z_i = X_i - cov(X_i,Z_1) Z_1/var(Z_1) - ... - cov(X_i,Z_{i-1}) Z_{i-1}/var(Z_{i-1})
>
> I have code to do this but I keep getting a non-conformable array error in the line with the covariance. Does anyone have any suggestions? Here is my code:
>
> gov=read.table(file.choose(), sep="\t", header=T)
> gov1=gov[3:length(gov[1,])]
> n_indices=length(names(gov1))
> x=data.matrix(gov1)
> v=x
> R=matrix(rep(0,length(x[,1])*length(x[1,])),length(x[,1]))
> for(j in 1:n_indices){
>   u=matrix(rep(0,length(v[,1])),length(v[,1]))
>   for(i in 1:j-1){
>     u = u+cov(v[,j],v[,i])*v[,i]/var(v[,i])  #(error here)
>   }
>   v[,j]=v[,j]-u
> }
>
> Thanks,
> Andrew

--
Michael Friendly     Email: friendly AT yorku DOT ca
Professor, Psychology Dept.
York University      Voice: 416 736-5115 x66249 Fax: 416 736-5814
4700 Keele Street    Web: http://www.datavis.ca
Toronto, ONT M3J 1P3 CANADA
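As a complement to the gsorth() pointer, here is a minimal base-R sketch of the sequential orthogonalization described in the question, using covariance as the inner product. The function name orthogonalize_cov and the simulated data are assumptions made for illustration and are not part of the thread or of heplots. One detail worth checking in the posted loop: 1:j-1 is parsed by R as (1:j) - 1, i.e. 0:(j-1), so the first pass indexes column 0; this is probably what triggers the non-conformable error. The sketch uses seq_len(j - 1) instead.

```r
# Gram-Schmidt style orthogonalization with cov() as the inner product.
# Each column has the part correlated with earlier (already orthogonalized)
# columns subtracted off; the first column is left unchanged.
orthogonalize_cov <- function(x) {
  x <- as.matrix(x)
  z <- x
  for (j in seq_len(ncol(x))[-1]) {
    for (i in seq_len(j - 1)) {   # 1, ..., j-1 (not 1:j-1, which is 0:(j-1))
      z[, j] <- z[, j] - cov(x[, j], z[, i]) / var(z[, i]) * z[, i]
    }
  }
  z
}

# Simulated example data (hypothetical, for illustration only)
set.seed(1)
x <- matrix(rnorm(200), ncol = 4)
x[, 2] <- x[, 2] + 0.5 * x[, 1]   # make columns 1 and 2 correlated
z <- orthogonalize_cov(x)

# The off-diagonal covariances of the result should be numerically zero
print(round(cov(z), 10))
```

The check at the end is the useful part: if the off-diagonal entries of cov(z) are not close to zero, the projection coefficients are being computed against the wrong columns.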
Re: [R] cross-validation for choosing regression trees
Forgive me if I misunderstand your goals, but I have no idea what you are trying to determine or what your data is. I can say, however, that setting mindev to 0 has always overfit data for me, and that you are more than likely looking at a situation in which that 1 node tree is more accurate.

Also, if you look at ?cv.tree, the default function to use is prune.tree(). Perhaps prune.tree() is trimming down to that terminal node?

If you want an alternative look at CART methods that may account for some of your issues, I would recommend the packages 'rpart' and 'party', as they may be more informative.

--
Jonathan P. Daily
Technician - USGS Leetown Science Center
11649 Leetown Road
Kearneysville WV, 25430
(304) 724-4480
"Is the room still a room when its empty? Does the room, the thing itself have purpose? Or do we, what's the word... imbue it." - Jubal Early, Firefly

From: Shiyao Liu lsy...@iastate.edu
To: r-help@r-project.org
Date: 11/03/2010 09:04 PM
Subject: [R] cross-validation for choosing regression trees
Sent by: r-help-boun...@r-project.org

> Dear All,
>
> We came across a problem when using the tree package to analyze our data set. First, in the tree function, if we use the default value mindev=0.01, the resulting regression tree has a single node. So, we set mindev=0, and obtain a tree with 931 terminal nodes. However, when we further use the cv.tree function to run a 10-fold cross-validation, the error message is:
>
> Error in prune.tree(list(frame = list(var = 1L, n = 6676, dev = 3.28220789569792, : can not prune singlenode tree.
>
> Is the cv.tree function respecting the mindev chosen in the tree function or what else might be wrong?
>
> Thanks,
> Shiyao
[R] Sorting data from one column with strings
Hello,

I have tried to find this out some other way, but without success, so I have to try this list. I assume this should be quite simple.

I have a dataset with 4 columns, Sample_no, Species, Nitrogen, Carbon, in csv format. In the Species column I have many different species with a varying number of observations per species, e.g.

Sample_no  Species  Nitrogen  Carbon
1          Cod      15.2      -19.0
2          Haddock  14.8      -20.2
3          Cod      15.6      -18.5
4          Cod      13.2      -20.1
5          Haddock  14.3      -18.8
etc.

I want to calculate the mean, standard deviation etc. per species for the observations Nitrogen and Carbon, and later do plots and stats with the different species. I will in the end have many species, so it needs to be automatic; I can't enter code for every species separately.

Can anyone help me with this? Or if this is the wrong list to send this question to, where do I send it?

Thank you very much in advance.

Best regards
Silje Ramsvatn
PhD-candidate
University of Tromsø
Norway
Re: [R] Sorting data from one column with strings
On Nov 4, 2010, at 8:28 AM, Ramsvatn Silje wrote:
> [snip]
> And I want to calculate, mean, standard dev etc per species for the observations Nitrogen and Carbon. And later do plots and stats with the different species. I will in the end have many species, so need it to be automatic I can't enter code for every species separate.

http://finzi.psych.upenn.edu/R/library/prettyR/html/brkdn.html
http://finzi.psych.upenn.edu/R/library/Hmisc/html/describe.html

e.g.

library(Hmisc)
with( dfrm, describe( ~Species) )

I think you could also probably do

lapply(split(dfrm, dfrm$Species), describe)

The Hmisc::describe function is especially good at first examining a vector and applying the appropriate methods to the type of data. There are several other packages with different describe functions. And there are several other packages, such as doBy and plyr, that will offer other concise methods for doing your by-category statistics.

-- David.

David Winsemius, MD
West Hartford, CT
Re: [R] multi-level cox ph with time-dependent covariates
Your question has two levels:
1. What is the right model for this data?
2. Can model __ be fit?

With respect to 2 and coxme: For a reliable fit you need to have more events than random effects. Thus for patient/tissue I would want to see multiple events per patient/tissue pair. This is a statistical issue: when there are too few events the confidence intervals for the random effects end up being a mile wide. (Exception: if the number of events is very large, 10^5 say, as sometimes occurs in economics studies, the estimates can work.) coxme works fine with (start, stop) data.

With respect to question 1: Your models assume that marker1, marker2, ... each have the same effect across tissue types. Adding a random effect gave per-subject or per-subject/tissue intercepts. Do you instead want to do shrinkage of the marker1, ... coefficients?

Terry Therneau
[R] Odp: Sorting data from one column with strings
Hi

r-help-boun...@r-project.org wrote on 04.11.2010 13:28:06:
> [snip]
> And I want to calculate, mean, standard dev etc per species for the observations Nitrogen and Carbon. And later do plots and stats with the different species. I will in the end have many species, so need it to be automatic I can't enter code for every species separate.

No need for sorting. You can use R, particularly the ?tapply, ?by or ?aggregate commands. Regarding plots, you can consider lattice or ggplot2, but you can also get good results with base graphics.

aggregate(your.data[,3:4], list(your.data$Species), function(x) c(mean(x), sd(x)))

xyplot(Nitrogen~Carbon|Species, data=your.data)

Regards
Petr
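To make the aggregate() suggestion concrete, here is a small self-contained sketch using the five example rows from the question. The data frame name fish is an assumption for illustration; the formula interface is used here as an alternative to the column-index call above, and it scales to any number of species without per-species code.

```r
# The five example rows from the question, typed in by hand
fish <- data.frame(
  Sample_no = 1:5,
  Species   = c("Cod", "Haddock", "Cod", "Cod", "Haddock"),
  Nitrogen  = c(15.2, 14.8, 15.6, 13.2, 14.3),
  Carbon    = c(-19.0, -20.2, -18.5, -20.1, -18.8)
)

# One row per species, one column per measured variable
means <- aggregate(cbind(Nitrogen, Carbon) ~ Species, data = fish, FUN = mean)
sds   <- aggregate(cbind(Nitrogen, Carbon) ~ Species, data = fish, FUN = sd)

print(means)
print(sds)
```

With real data, fish would instead come from something like read.csv() on the poster's file.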
[R] How to set initial values in lme ?
Hello R-users,

Does anyone know how to set initial values in lme? I have a problem of non-convergence and would like to try different initial values. Is lmeScale the right function to do it, and how many parameters do we need to specify: only fixed parameters, or also random effects?

Thanks for any advice,
Best regards,

--
THAI Hoai Thu
INSERM U738 - Université Paris 7
16 rue Henri Huchard
75018 Paris, FRANCE
Tel: 01 57 27 75 39
Email: hoai-thu.t...@inserm.fr
Re: [R] Loop
Hi David,

I am still having trouble with that loop. This code gives me (kind of) the name of the column/field in a data frame. Field names are of the form W1-W10. But there is a space between W and the number (W 10), and column (field) names do not contain numbers.

for(i in 1:10) {
  vari <- paste("W", i)
}
vari
[1] "W 10"

Now, as I understand it, I would then call the different columns into R with

w <- lit[[vari]]

Or am I wrong again? Then I would probably need another loop to create the names of the variables in R, i.e. w1 to w10. Is that the general idea of the procedure?

Thanks for the help, m

-----Original Message-----
From: David Winsemius [mailto:dwinsem...@comcast.net]
Sent: Wednesday, November 03, 2010 10:41 PM
To: Matevž Pavlič
Cc: r-help@r-project.org
Subject: Re: [R] Loop

On Nov 3, 2010, at 5:03 PM, Matevž Pavlič wrote:
> Hi,
> Thanks for the help and the manuals. They will come in very handy, I am sure. But regarding the code, I don't think this is what I want. Basically I would like to repeat the code below:
>
> w1<-table(lit$W1)
> w1<-as.data.frame(w1)

It appears you are not reading for meaning. Burns has advised you how to construct column names and use them in your initial steps. The `$` function is quite limited in comparison to `[[`, so he was showing you a method that would be more effective. BTW the as.data.frame step is unnecessary, since the first thing write.table does is coerce an object to a data.frame. The write.table name is misleading. It should be write.data.frame. You cannot really write tables with write.table.

You would also use:

file=paste(vari, "csv", sep=".")

as the file argument to write.table.

> write.table(w1,file="w1.csv",sep=";",row.names=T, dec=".")

What are these next actions supposed to do after the file is written? Are you trying to store a group of related w objects that will later be indexed in sequence? If so, then a list would make more sense.

-- David.

> w1<- w1[order(w1$Freq, decreasing=TRUE),]
> w1<-head(w1, 20)
>
> 20 times, where W1-20 (capital letters) are the fields in a data.frame called lit and w1-20 are the data.frames being created. Hope that explains it better, m
>
> -----Original Message-----
> From: Patrick Burns [mailto:pbu...@pburns.seanet.com]
> Subject: Re: [R] Loop
>
> If I understand properly, you'll want something like:
>   lit[["w2"]]
> instead of
>   lit$w2
> more accurately:
>   for(i in 1:20) {
>     vari <- paste("w", i)
>     lit[[vari]] ...
>   }
> The two documents mentioned in my signature may help you.
>
> On 03/11/2010 20:23, Matevž Pavlič wrote:
>> Hi all,
>> I managed to do what I want (with the great help of this mailing list) manually. Now I would like to automate it. I would probably need a for loop to help me with this, but of course I have no idea how to do that in R. Below is the code that I would like to be replicated a number of times (let's say 20). I would like w1 to change to w2, w3, w4 ... up to w20, and by that create 20 data.frames that I would then bind together with cbind. (I did it as shown below, manually.)
>>
>> w1<-table(lit$W1)
>> w1<-as.data.frame(w1)
>> write.table(w1,file="w1.csv",sep=";",row.names=T, dec=".")
>> w1<- w1[order(w1$Freq, decreasing=TRUE),]
>> w1<-head(w1, 20)
>>
>> w2<-table(lit$W2)
>> w2<-as.data.frame(w2)
>> write.table(w2,file="w2.csv",sep=";",row.names=T, dec=".")
>> w2<- w2[order(w2$Freq, decreasing=TRUE),]
>> w2<-head(w2, 20)
>> . . .
>>
>> Thanks for the help, m

David Winsemius, MD
West Hartford, CT
Re: [R] Loop
On Nov 4, 2010, at 9:21 AM, Matevž Pavlič wrote:
> Hi David,
> I am still having troubles with that loop ... This code gives me (kinda) the name of the column/field in a data frame. Filed names are form W1-W10. But there is a space between W and a number -- W 10, and column (field) names do not contain numbers.
>
> for(i in 1:10) {
>   vari <- paste("W", i)

Should be:

  vari <- paste("w", i, sep="")

> }
> vari
> [1] "W 10"
>
> Now as i understand than i would call different columns to R with
> w <- lit[[vari]]
> Or am i wrong again?

Maybe. Since you overwrote the first nine values, there is only one element in vari outside the loop. I would do the assignment inside the loop, and I suggested that the results be stored in a list that is indexed either by vari or by i (but without the quotes if you are typing lit[[vari]]).

-- David.

> Then I would probably need another loop to create the names of the variables on R, i.e. w1 to w10. Is that a general idea for the procedure?
> Thank for the help, m
>
> -----Original Message-----
> From: David Winsemius [mailto:dwinsem...@comcast.net]
> Sent: Wednesday, November 03, 2010 10:41 PM
> To: Matevž Pavlič
> Cc: r-help@r-project.org
> Subject: Re: [R] Loop
> [snip]

David Winsemius, MD
West Hartford, CT
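Putting the advice in this thread together, a loop along the following lines may be what is wanted. This is only a sketch: the data frame lit below is a made-up stand-in with three columns (the poster's real data has more), the list name top_counts is hypothetical, and the files are written to tempdir() rather than the working directory. The column name is built with sep = "" so that there is no space, the column is selected with `[[`, and each result is stored in a list inside the loop instead of being overwritten.

```r
# Hypothetical stand-in for the poster's data frame `lit`
set.seed(42)
lit <- data.frame(
  W1 = sample(c("clay", "sand", "silt"), 30, replace = TRUE),
  W2 = sample(c("clay", "gravel"), 30, replace = TRUE),
  W3 = sample(c("sand", "silt", "marl"), 30, replace = TRUE)
)

top_counts <- list()
for (i in 1:3) {
  vari <- paste("W", i, sep = "")                 # "W1", "W2", "W3"
  freq <- as.data.frame(table(lit[[vari]]))        # columns Var1 and Freq

  # one csv file per column, named W1.csv, W2.csv, ...
  out_file <- file.path(tempdir(), paste(vari, "csv", sep = "."))
  write.table(freq, file = out_file, sep = ";", row.names = TRUE, dec = ".")

  # sort by frequency and keep at most the 20 most frequent values
  freq <- freq[order(freq$Freq, decreasing = TRUE), ]
  top_counts[[vari]] <- head(freq, 20)
}

print(names(top_counts))
print(top_counts[["W1"]])
```

Keeping the results in a list means they can later be accessed as top_counts[["W2"]] or top_counts[[2]], which avoids creating twenty separately named objects.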
Re: [R] Sorting data from one column with strings
Try tapply(). For example:

tapply(data$Nitrogen,factor(data$Species),mean)

For the Nitrogen column, the mean is calculated for each Species (if the data frame below is in the object data).

Regards,
Annemarie Eigenhuis

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Ramsvatn Silje
Sent: Thursday 4 November 2010 13:28
To: R-help@r-project.org
Subject: [R] Sorting data from one column with strings

[snip]
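A short sketch of the tapply() approach on the example rows from the question follows. The object names (fish, nitrogen_mean, means_by_species) are assumptions for illustration. The last step shows one possible way to apply the same per-species summary to several columns at once, so that no code has to be written per species or per variable.

```r
# Example rows from the question (hypothetical object name `fish`)
fish <- data.frame(
  Species  = c("Cod", "Haddock", "Cod", "Cod", "Haddock"),
  Nitrogen = c(15.2, 14.8, 15.6, 13.2, 14.3),
  Carbon   = c(-19.0, -20.2, -18.5, -20.1, -18.8)
)

# Per-species mean and standard deviation of one column
nitrogen_mean <- tapply(fish$Nitrogen, fish$Species, mean)
nitrogen_sd   <- tapply(fish$Nitrogen, fish$Species, sd)

# Same summary for several columns: rows are species, columns are variables
means_by_species <- sapply(fish[c("Nitrogen", "Carbon")],
                           function(col) tapply(col, fish$Species, mean))

print(nitrogen_mean)
print(nitrogen_sd)
print(means_by_species)
```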
[R] (no subject)
Hello, I'm Roesda from Indonesia.

I am having trouble performing parameter estimation by the maximum likelihood (MLE) method in R, because the distribution I need to use is not one of the well-known distributions such as the gamma, Poisson or binomial. The distribution whose parameters I want to estimate is the joint distribution of the negative binomial and Lindley distributions. How do I implement this in R when the distribution is new, as I mentioned?

I hope someone can help me. Thank you very much.
[R] ggplot output
Dear All,

I have this script:

dat <- data.frame(Month = hstat$Date, C_avg = hstat$C.avg, C_stdev = hstat$C.stdev)
ggplot(data = dat, aes(x = Month, y = C_avg, ymin = C_avg - C_stdev, ymax = C_avg + C_stdev)) + geom_point() + geom_line() + geom_errorbar()

dat <- data.frame(Month = hstat$Date, K_avg = hstat$K.avg, K_stdev = hstat$K.stdev)
ggplot(data = dat, aes(x = Month, y = K_avg, ymin = K_avg - K_stdev, ymax = K_avg + K_stdev)) + geom_point() + geom_line() + geom_errorbar()

dat <- data.frame(Month = hstat$Date, S_avg = hstat$S.avg, S_stdev = hstat$S.stdev)
ggplot(data = dat, aes(x = Month, y = S_avg, ymin = S_avg - S_stdev, ymax = S_avg + S_stdev)) + geom_point() + geom_line() + geom_errorbar()

Running the script generates 3 separate graphs. How can I output them next to each other?

Thanks

--
View this message in context: http://r.789695.n4.nabble.com/ggplot-output-tp3027026p3027026.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] ggplot output
Have you tried ?split.screen

Annemarie Eigenhuis, MSc
University of Amsterdam
Department of Psychology, clinical area
Roetersstraat 15
1018 WB Amsterdam
The Netherlands
phone: +31(0)205256815
email: a.eigenh...@uva.nl

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of ashz
Sent: Thursday 4 November 2010 14:37
To: r-help@r-project.org
Subject: [R] ggplot output

[snip]
Re: [R] ggplot output
The easiest way is to create one long dataset with four variables: Month, avg, stdev and type. Type will be either K, C or S. Then you just need to add some faceting to your code:

ggplot(data = dat, aes(x = Month, y = avg, ymin = avg - stdev, ymax = avg + stdev)) + geom_point() + geom_line() + geom_errorbar() + facet_wrap(~type, nrow = 1)

HTH,

Thierry

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek, team Biometrie Kwaliteitszorg
Research Institute for Nature and Forest, team Biometrics Quality Assurance
Gaverstraat 4, 9500 Geraardsbergen, Belgium
tel. + 32 54/436 185 thierry.onkel...@inbo.be www.inbo.be

To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey
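A minimal sketch of the "one long dataset" idea described above. The `hstat` data frame below is a made-up stand-in (only its column names come from the thread), and `types`, `long` and `p` are illustrative names. The plotting step is skipped if ggplot2 is not installed.

```r
# Hypothetical stand-in for the poster's 'hstat'; the numbers are invented.
hstat <- data.frame(
  Date    = 1:6,
  C.avg   = c(5.1, 5.4, 5.0, 5.6, 5.8, 5.5),
  C.stdev = c(0.4, 0.3, 0.5, 0.4, 0.2, 0.3),
  K.avg   = c(2.0, 2.2, 2.1, 2.5, 2.4, 2.6),
  K.stdev = c(0.2, 0.2, 0.1, 0.3, 0.2, 0.1),
  S.avg   = c(9.0, 8.7, 9.3, 9.1, 8.9, 9.4),
  S.stdev = c(0.6, 0.5, 0.7, 0.4, 0.5, 0.6)
)

types <- c("C", "K", "S")

# Build one block of rows per type and stack them into a long data frame.
long <- do.call(rbind, lapply(types, function(tp) {
  data.frame(
    Month = hstat$Date,
    avg   = hstat[[paste(tp, "avg", sep = ".")]],
    stdev = hstat[[paste(tp, "stdev", sep = ".")]],
    type  = tp
  )
}))

# One panel per type, side by side.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(long, aes(x = Month, y = avg, ymin = avg - stdev, ymax = avg + stdev)) +
    geom_point() + geom_line() + geom_errorbar() +
    facet_wrap(~type, nrow = 1)
  print(p)
}
```

With real data, `Month` would presumably hold dates rather than the integers used here.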
Re: [R] ggplot output
split.screen() and par() don't work with ggplot2, because ggplot2 draws with grid rather than base graphics.

ir. Thierry Onkelinx
Research Institute for Nature and Forest, team Biometrics Quality Assurance
thierry.onkel...@inbo.be www.inbo.be

-----Original message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On behalf of Eigenhuis, Annemarie
Sent: Thursday, 4 November 2010 14:53
To: ashz; r-help@r-project.org
Subject: Re: [R] ggplot output

Have you tried ?split.screen
[R] Problems with points in plots when importing from pdf to an SVG editor
Dear R-users,

When trying to import graphics from a PDF file into a vector graphics editor (I use Inkscape, but I've confirmed the same problem with Adobe products), all points in the graphics turn out as "q"s. This example displays the behaviour:

pdf(file = "points are weird.pdf")
plot(1:5)
dev.off()

When importing the file into Inkscape, I get five neatly arranged little "q"s. The obvious workaround would be to change the points to another plotting character, but this isn't the first time I've encountered this behaviour, and it would be nice to solve it properly instead. I realize this might not strictly be a question related to R, but if someone who has encountered the problem has found a solution or workaround, it would be greatly appreciated.

Kind regards,
Rafael
Re: [R] avoiding too many loops - reshaping data
Beware of facile comparisons of this sort -- they may be apples and nematodes. And they also imply that the main time sink is the computation. In my experience, figuring out how to solve the problem takes considerably more time than 18 / 1000 seconds, and so investing your energy in learning idioms that apply in a wide range of situations is far more useful than figuring out the fastest solution to a single problem.

Hadley

-- Assistant Professor / Dobelman Family Junior Chair, Department of Statistics / Rice University http://had.co.nz/
Re: [R] Problems with points in plots when importing from pdf to an SVG editor
Hi,

Try RSvgDevice::devSVG(). I don't have any problems with either Inkscape or Illustrator (CS4).

HTH,
Ivan

-- Ivan CALANDRA, PhD Student
University of Hamburg, Biozentrum Grindel und Zoologisches Museum, Abt. Säugetiere
Martin-Luther-King-Platz 3, D-20146 Hamburg, GERMANY
+49(0)40 42838 6231 ivan.calan...@uni-hamburg.de
http://www.for771.uni-bonn.de http://webapp5.rrz.uni-hamburg.de/mammals/eng/mitarbeiter.php
Re: [R] ggplot output
Dear Thierry,

Your solution looks very elegant, but I cannot find a proper example. Can you provide me with one?

Thx
Re: [R] Loop
Hi

r-help-boun...@r-project.org wrote on 04.11.2010 14:21:38:

> Hi David,
>
> I am still having trouble with that loop ... This code gives me (kind of) the name of the column/field in a data frame. Field names are of the form W1-W10. But there is a space between W and the number -- W 10, and column (field) names do not contain numbers.
>
> for (i in 1:10) { vari <- paste("W", i) }
> vari
> [1] "W 10"
>
> Now, as I understand it, I would then pull the different columns into R with
>
> w <- lit[[vari]]
>
> Or am I wrong again? Then I would probably need another loop to create the names of the variables in R, i.e. w1 to w10. Is that the general idea of the procedure?

Beware of such loops. Instead of littering your workspace with files/objects constructed by some paste(whatever, i) solution, you can save results in a list, data.frame or matrix and simply use basic subsetting procedures or the lapply/sapply functions. I must say I have never used such a paste(...) construction, and I have worked with R for quite a long time.

Regards
Petr

> Thanks for the help, m
>
> -----Original Message-----
> From: David Winsemius [mailto:dwinsem...@comcast.net]
> Sent: Wednesday, November 03, 2010 10:41 PM
> To: Matevž Pavlič
> Cc: r-help@r-project.org
> Subject: Re: [R] Loop
>
> On Nov 3, 2010, at 5:03 PM, Matevž Pavlič wrote:
>
> Hi, Thanks for the help and the manuals. They will come in very handy, I am sure. But regarding the code, I don't think this is what I want ... basically I would like to repeat the code below:
>
> w1 <- table(lit$W1)
> w1 <- as.data.frame(w1)
>
> It appears you are not reading for meaning. Burns has advised you how to construct column names and use them in your initial steps. The `$` function is quite limited in comparison to `[[`, so he was showing you a method that would be more effective. BTW the as.data.frame step is unnecessary, since the first thing write.table does is coerce an object to a data.frame. The write.table name is misleading. It should be write.data.frame. You cannot really write tables with write.table. You would also use:
>
> file = paste(vari, "csv", sep = ".")
>
> as the file argument to write.table
>
> write.table(w1, file = "w1.csv", sep = ";", row.names = T, dec = ".")
>
> What are these next actions supposed to do after the file is written? Are you trying to store a group of related w objects that will later be indexed in sequence? If so, then a list would make more sense. -- David.
>
> w1 <- w1[order(w1$Freq, decreasing = TRUE), ]
> w1 <- head(w1, 20)
>
> 20 times, where W1-20 (capital letters) are the fields in a data.frame called lit and w1-20 are the data.frames being created. Hope that explains it better, m
>
> -----Original Message-----
> From: Patrick Burns [mailto:pbu...@pburns.seanet.com]
> Subject: Re: [R] Loop
>
> If I understand properly, you'll want something like:
>
> lit[["w2"]]
>
> instead of
>
> lit$w2
>
> more accurately:
>
> for (i in 1:20) {
>   vari <- paste("w", i)
>   lit[[vari]] ...
> }
>
> The two documents mentioned in my signature may help you.
>
> On 03/11/2010 20:23, Matevž Pavlič wrote:
>
> Hi all, I managed to do what I want (with the great help of this mailing list) manually. Now I would like to automate it. I would probably need a for loop to help me with this ... but of course I have no idea how to do that in R. Below is the code that I would like to be replicated a number of times (let's say 20). I would like w1 to change to w2, w3, w4 ... up to w20, thereby creating 20 data.frames that I would then bind together with cbind. (I did it as shown below, manually.)
>
> w1 <- table(lit$W1)
> w1 <- as.data.frame(w1)
> write.table(w1, file = "w1.csv", sep = ";", row.names = T, dec = ".")
> w1 <- w1[order(w1$Freq, decreasing = TRUE), ]
> w1 <- head(w1, 20)
> w2 <- table(lit$W2)
> w2 <- as.data.frame(w2)
> write.table(w2, file = "w2.csv", sep = ";", row.names = T, dec = ".")
> w2 <- w2[order(w2$Freq, decreasing = TRUE), ]
> w2 <- head(w2, 20)
> . . .
>
> Thanks for the help, m
>
> David Winsemius, MD West Hartford, CT
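The advice in this thread (build column names as strings, index with `[[`, keep the results in a list) might look roughly like the sketch below. The `lit` data frame here is invented for illustration, with three columns instead of twenty, and `cols` and `top_counts` are hypothetical names.

```r
# Hypothetical stand-in for the poster's data frame 'lit'.
set.seed(1)
lit <- data.frame(
  W1 = sample(c("sand", "clay", "silt"), 50, replace = TRUE),
  W2 = sample(c("gravel", "clay"), 50, replace = TRUE),
  W3 = sample(c("sand", "peat", "silt", "clay"), 50, replace = TRUE),
  stringsAsFactors = FALSE
)

# sep = "" avoids the space that paste("W", i) inserts by default.
cols <- paste("W", 1:3, sep = "")

# One frequency table per column, sorted by count, top 20 rows kept.
top_counts <- lapply(cols, function(col) {
  freq <- as.data.frame(table(lit[[col]]), stringsAsFactors = FALSE)
  freq <- freq[order(freq$Freq, decreasing = TRUE), ]
  head(freq, 20)
})
names(top_counts) <- tolower(cols)

top_counts$w1

# Optional: one CSV per column (w1.csv, w2.csv, ...), left commented out here.
# for (nm in names(top_counts)) {
#   write.table(top_counts[[nm]], file = paste(nm, "csv", sep = "."),
#               sep = ";", dec = ".")
# }
```

Keeping the tables in one named list means no w1 ... w20 objects are created in the workspace, and any of them can still be reached as `top_counts[["w2"]]`.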
[R] Converting Strings to Variable names
Hi all,

I am processing data from 24 samples and combining them into a single table called CombinedSamples using the following:

CombinedSamples <- rbind(Sample1, Sample2, Sample3)

The variables Sample1, Sample2 and Sample3 each have many different columns. To make this more flexible for other samples, I'm replacing the above code with a for loop:

# Sample is a string vector containing all 24 sample names
for (k in 1:length(Sample)) { CombinedSamples <- rbind(get(Sample[k])) }

This code only stores the last sample's data, as CombinedSamples gets overwritten every time. Using CombinedSamples[k] or CombinedSamples[k,] causes dimension-related errors, as each sample has several rows and not just 24. So how can I assign the data of all 24 samples to CombinedSamples?

Thanks,
Anand
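One possible way to do this is sketched below, under the assumption that every sample is a data frame with the same columns. The three small samples and the name `Combined2` are invented for the example.

```r
# Hypothetical samples standing in for the poster's 24 objects.
Sample1 <- data.frame(id = 1:2, value = c(0.5, 0.7))
Sample2 <- data.frame(id = 3:5, value = c(1.1, 0.9, 1.3))
Sample3 <- data.frame(id = 6,   value = 0.2)
Sample  <- c("Sample1", "Sample2", "Sample3")

# Option 1: fetch all objects by name, then bind them in a single call.
CombinedSamples <- do.call(rbind, unname(mget(Sample, envir = environment())))

# Option 2: keep the loop, but bind each sample onto the accumulated result
# instead of overwriting it.
Combined2 <- NULL
for (k in seq_along(Sample)) {
  Combined2 <- rbind(Combined2, get(Sample[k]))
}

nrow(CombinedSamples)
```

The single `do.call(rbind, ...)` call is usually preferable, since growing a data frame inside a loop copies it on every iteration.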
Re: [R] How to do bootstrap for the complex sample design?
At 01:38 AM 11/4/2010, Fei xu wrote:

> Hello; Our survey is structured as follows: the area to be investigated is divided into 6 regions; within each region, one urban community and one rural community are randomly selected; then samples are randomly drawn from each selected urban and rural community. The problem is that in each urban/rural stratum we only have one sample. In this case, how do we do the bootstrap? Any comments or hints are greatly appreciated! Faye

Just make a table of your data, with each row corresponding to a measurement. Your columns will be Region, UrbanCommunity, RuralCommunity and your response variables. Bootstrap resampling is just generating random row indices into this table, with replacement. I.e.,

index <- sample(1:N, N, replace = TRUE)

Then your resample is myTable[index, ]. Because you chose UrbanCommunity and RuralCommunity randomly, this shouldn't be a problem. The fact that you chose a subsample size of 1 means you won't be able to estimate within-region variances unless you make some serious assumptions (e.g., UrbanCommunity effect independent of Region effect).

Robert A. LaBudde, PhD, PAS, Dpl. ACAFS e-mail: r...@lcfltd.com
Least Cost Formulations, Ltd. URL: http://lcfltd.com/
824 Timberlake Drive Tel: 757-467-0954
Virginia Beach, VA 23464-3239 Fax: 757-467-2947
Vere scire est per causas scire
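A small sketch of the row-index resampling described above, using an invented table. The layout of `myTable`, the `Urban` and `response` columns, and the choice of `B = 1000` replicates are all assumptions for illustration. It shows the plain row bootstrap only and does not address the design caveats raised elsewhere in this thread.

```r
set.seed(42)

# Hypothetical table: 6 regions, one urban and one rural row each.
myTable <- data.frame(
  Region   = rep(1:6, each = 2),
  Urban    = rep(c(TRUE, FALSE), times = 6),
  response = rnorm(12, mean = 10)
)

N <- nrow(myTable)
B <- 1000   # number of bootstrap replicates (arbitrary choice)

# Each replicate: draw N row indices with replacement, recompute the statistic.
boot_means <- replicate(B, {
  index <- sample(1:N, N, replace = TRUE)
  mean(myTable[index, "response"])
})

sd(boot_means)   # bootstrap standard error of the mean response
```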
Re: [R] ggplot output
Have a look at the ggplot2 website. It has a lot of examples.

http://had.co.nz/ggplot2/ (look at the bottom of this page for facet_grid() and facet_wrap())
http://had.co.nz/ggplot2/facet_wrap.html (direct link to facet_wrap())

ir. Thierry Onkelinx
Research Institute for Nature and Forest, team Biometrics Quality Assurance
thierry.onkel...@inbo.be www.inbo.be
Re: [R] Loop
Hi all,

I understand that for most of you this is a piece of cake, but I am a complete newbie at this, so any example would be greatly appreciated, as would any hint on how to find my way around R. Frankly, I sometimes find the help files rather confusing.

M
[R] removing indexing
R-help,

I was wondering how to remove the indexing from an output, e.g.,

aVector <- 1:10
aVector
 [1]  1  2  3  4  5  6  7  8  9 10
someFunction(aVector)
 1  2  3  4  5  6  7  8  9 10

Thanks in advance
Re: [R] How to do bootstrap for the complex sample design?
Faye wrote:

> Our survey is structured as follows: the area to be investigated is divided into 6 regions; within each region, one urban community and one rural community are randomly selected; then samples are randomly drawn from each selected urban and rural community. The problem is that in each urban/rural stratum we only have one sample. In this case, how do we do the bootstrap?

You are lucky that your sample size is 1. If it were 2 you would probably have proceeded without realizing that the answers were wrong.

Suppose you had two samples in each stratum. If you proceed naturally, drawing bootstrap samples of size 2 from each stratum, this would underestimate variability by a factor of 2. In general the ordinary nonparametric bootstrap estimates of variability are biased downward by a factor of (n-1)/n -- exactly for the mean, approximately for other statistics. In multiple-sample and stratified situations, the bias depends on the stratum sizes.

Three remedies are:
* draw bootstrap samples of size n-1
* bootknife sampling - omit one observation (a jackknife sample), then draw a bootstrap sample of size n from that
* bootstrap from a kernel density estimate, with kernel covariance equal to empirical covariance (with divisor n-1) / n.

The latter two are described in Hesterberg, Tim C. (2004), Unbiasing the Bootstrap-Bootknife Sampling vs. Smoothing, Proceedings of the Section on Statistics and the Environment, American Statistical Association, 2924-2930. http://home.comcast.net/~timhesterberg/articles/JSM04-bootknife.pdf

All three are undefined for samples of size 1. You need to go to some other bootstrap, e.g. a parametric bootstrap with variability estimated from other data.

Tim Hesterberg
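The (n-1)/n factor for the mean can be checked directly, and bootknife sampling is short to write down. The sketch below uses made-up data; `bootknife_sample` is a hypothetical helper written from the one-line description above, not code taken from the cited paper.

```r
x <- c(2.1, 3.4, 1.9, 5.0, 4.2)   # made-up data
n <- length(x)

# Usual estimate of Var(mean): s^2 / n, where s^2 uses divisor n - 1.
var_unbiased <- var(x) / n

# Ideal (infinitely many resamples) bootstrap variance of the mean:
# the empirical distribution has variance with divisor n.
var_boot <- mean((x - mean(x))^2) / n

var_boot / var_unbiased   # equals (n - 1) / n

# Bootknife resample: drop one observation at random (a jackknife sample),
# then draw n values with replacement from the remaining n - 1.
bootknife_sample <- function(x) {
  n <- length(x)
  kept <- x[-sample.int(n, 1)]
  kept[sample.int(n - 1, n, replace = TRUE)]
}

set.seed(1)
bk_means <- replicate(2000, mean(bootknife_sample(x)))
var(bk_means)   # Monte Carlo estimate; compare with var_unbiased
```

As the message says, this helper is undefined for n = 1, since nothing is left to resample from after dropping the single observation.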
Re: [R] removing indexing
On 04/11/2010 10:53 AM, Luis Ridao wrote:

> R-help, I was wondering how to remove the indexing from an output, e.g.,
> aVector <- 1:10
> aVector
>  [1]  1  2  3  4  5  6  7  8  9 10
> someFunction(aVector)
>  1  2  3  4  5  6  7  8  9 10

The cat() function gives you lots of flexibility in how things print. Your example is a little complicated, because you appear to want the spacing that print() gives without the indexing, which probably means using sprintf() or format(). If that's just a relic of cut and paste, then

cat(aVector, "\n")

is fine. (The "\n" is optional; it goes to a new line.) If you really want the fancy spacing, something like

cat(format(aVector), "\n")

Duncan Murdoch
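For comparison, the variants mentioned above can be run side by side. The sprintf() line is one assumed way of building the same padding by hand.

```r
aVector <- 1:10

print(aVector)                        # default printing, with the [1] index
cat(aVector, "\n")                    # values only, separated by single spaces
cat(format(aVector), "\n")            # values only, padded to a common width

# The same padding built explicitly: width 2, right-justified.
cat(sprintf("%2d", aVector), "\n")
```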
Re: [R] ggplot output
The other way (in the same spirit as par(mfrow = ...) in base graphics) is to use the grid.arrange function in the gridExtra package. See its documentation for examples.
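A rough sketch of the grid.arrange route, assuming both ggplot2 and gridExtra are installed (the plotting step is skipped otherwise). The `hstat` values and the helper `make_data` are invented for illustration; only the column names come from the original question.

```r
# Hypothetical stand-in for the poster's 'hstat'.
hstat <- data.frame(
  Date    = 1:6,
  C.avg   = c(5.1, 5.4, 5.0, 5.6, 5.8, 5.5), C.stdev = c(0.4, 0.3, 0.5, 0.4, 0.2, 0.3),
  K.avg   = c(2.0, 2.2, 2.1, 2.5, 2.4, 2.6), K.stdev = c(0.2, 0.2, 0.1, 0.3, 0.2, 0.1),
  S.avg   = c(9.0, 8.7, 9.3, 9.1, 8.9, 9.4), S.stdev = c(0.6, 0.5, 0.7, 0.4, 0.5, 0.6)
)

# Illustrative helper: pick the columns for one type ("C", "K" or "S").
make_data <- function(tp) {
  data.frame(
    Month = hstat$Date,
    avg   = hstat[[paste(tp, "avg", sep = ".")]],
    stdev = hstat[[paste(tp, "stdev", sep = ".")]]
  )
}

if (requireNamespace("ggplot2", quietly = TRUE) &&
    requireNamespace("gridExtra", quietly = TRUE)) {
  library(ggplot2)
  # One separate plot per type ...
  plots <- lapply(c("C", "K", "S"), function(tp) {
    ggplot(make_data(tp), aes(x = Month, y = avg, ymin = avg - stdev, ymax = avg + stdev)) +
      geom_point() + geom_line() + geom_errorbar() + ggtitle(tp)
  })
  # ... arranged next to each other in three columns.
  do.call(gridExtra::grid.arrange, c(plots, ncol = 3))
}
```

Unlike faceting, each panel here keeps its own y-axis scale, which may or may not be what is wanted.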
Re: [R] density() function: differences with S-PLUS
I suspect that R's help(density) will tell you about the difference between its bw and width arguments. In S-PLUS, help(density) says about width:

width: width of the window. ... The standard error of a Gaussian window is width/4. For the other windows width is the width of the interval on which the window is non-zero.

I believe R's bw argument is the standard deviation of the density used for the kernel. In R, 'width' has the same meaning as in S+. The small difference between estimates when using the same bandwidth is mainly due to S+ using a truncated Gaussian kernel (at 4 standard deviations out) and R not truncating the kernel. Part of the difference is due to R using the Fourier transform to do the convolution of the kernel and the data, while S+ uses a direct approach.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

From: mailingl...@sturaro.net [mailto:mailingl...@sturaro.net] On Behalf Of Nicola Sturaro Sommacal (Quantide srl)
Sent: Thursday, November 04, 2010 2:36 AM
To: William Dunlap
Subject: Re: [R] density() function: differences with S-PLUS

Dear William,

I obtained the same x values also without the from= and to= arguments, using bw instead of width in R. At this point I tried a two-step procedure for the y:
- in the first step I obtained the x as below,
- in the second step I used the minimum and maximum values of the x as the from= and to= arguments.
In this way I obtain, in R, y values close to the S+ ones, but not the same. R code and S+ code and output are below.

Thanks again.
Nicola

# R CODE
exdata = iris$Sepal.Length[iris$Species == "setosa"]
density(exdata, bw = 4, n = 50, cut = 0.75)$x # SAME AS S+
density(exdata, bw = 4, n = 50, cut = 0.75)$y # COMPLETELY DIFFERENT
density(exdata, width = 4, n = 50, from = 1.3, to = 8.8, cut = 0.75)$y # CLOSE TO S+

# SPLUS CODE AND OUTPUT
exdata = iris[, 1, 1]
density(exdata, width = 4)
$x:
 [1] 1.30 1.453061 1.606122 1.759184 1.912245 2.065306
 [7] 2.218367 2.371429 2.524490 2.677551 2.830612 2.983673
[13] 3.136735 3.289796 3.442857 3.595918 3.748980 3.902041
[19] 4.055102 4.208163 4.361224 4.514286 4.667347 4.820408
[25] 4.973469 5.126531 5.279592 5.432653 5.585714 5.738776
[31] 5.891837 6.044898 6.197959 6.351020 6.504082 6.657143
[37] 6.810204 6.963265 7.116327 7.269388 7.422449 7.575510
[43] 7.728571 7.881633 8.034694 8.187755 8.340816 8.493878
[49] 8.646939 8.80
$y:
 [1] 0.0007849649 0.0013097474 0.0021225491 0.0033616520
 [5] 0.0052059615 0.0078856717 0.0116917555 0.0169685132
 [9] 0.0241073754 0.0335286785 0.0456521053 0.0608554862
[13] 0.0794235072 0.1014901241 0.1269807991 0.1555625999
[17] 0.1866111931 0.2192033788 0.2521417640 0.2840144993
[21] 0.3132881074 0.3384260582 0.3580208688 0.3709241384
[25] 0.3763578665 0.3739920600 0.3639778683 0.3469316232
[29] 0.3238721233 0.2961200278 0.2651731505 0.2325739601
[33] 0.1997853985 0.1680884651 0.1385105802 0.1117884914
[37] 0.0883644110 0.0684099972 0.0518702141 0.0385181792
[41] 0.0280126487 0.0199513951 0.0139159044 0.0095050745
[45] 0.0063575653 0.0041639082 0.0026680819 0.0016700727
[49] 0.0010169912 0.0005962089

2010/11/3 William Dunlap wdun...@tibco.com

Did you get my reply (1:31pm PST Tuesday) to your request? It showed how you needed to use the from= and to= arguments to density to get identical x components in the output, and that the small differences in the y component were due to S+ truncating the Gaussian kernel at +- 4 standard deviations from the center while R does not truncate the Gaussian kernel (its output looks like it uses a Fourier transform to do the convolution).

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Nicola Sturaro Sommacal (Quantide srl)
Sent: Wednesday, November 03, 2010 3:34 AM
To: Joshua Wiley
Cc: r-help@r-project.org
Subject: Re: [R] density() function: differences with S-PLUS

Dear Joshua, first of all, thank you very much for your reply. I hoped that
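The relation between the two arguments can be checked in R itself. As I read ?density, a numeric width is converted to bw by a kernel-dependent factor, which is 4 for the Gaussian kernel, so width = 4 should match bw = 1 rather than bw = 4. The sketch below reuses the from/to/n values quoted in the thread; the object names are illustrative.

```r
exdata <- iris$Sepal.Length[iris$Species == "setosa"]

d_width <- density(exdata, width = 4, n = 50, from = 1.3, to = 8.8)
d_bw    <- density(exdata, bw = 1,    n = 50, from = 1.3, to = 8.8)

d_width$bw                     # the bandwidth actually used for width = 4
all.equal(d_width$y, d_bw$y)   # the two calls give the same estimate

# bw = 4 is a four times wider kernel, hence a much flatter curve.
d_bw4 <- density(exdata, bw = 4, n = 50, from = 1.3, to = 8.8)
max(d_bw4$y) < max(d_bw$y)
```

That would explain why the bw = 4 call in the quoted R code gave y values described as completely different from the S+ output.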
Re: [R] Problems with points in plots when importing from pdf to an SVG editor
Just read the help page :). This is under Note in ?pdf:

On some systems the default plotting character 'pch = 1' is displayed in some PDF viewers incorrectly as a 'q' character. (These seem to be viewers based on the 'poppler' PDF rendering library.) This may be due to incorrect or incomplete mapping of font names to those used by the system. Adding the following lines to '~/.fonts.conf' or '/etc/fonts/local.conf' may circumvent this problem.

<alias binding="same">
  <family>ZapfDingbats</family>
  <accept><family>Dingbats</family></accept>
</alias>

I've found that in my case, this happens when viewing a PDF with that plotting character under old versions of Evince, but not newer.

--Erik
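Another option on the R side, not mentioned in the thread, is the useDingbats argument of pdf(), which controls whether small circles are drawn with the Dingbats font. Whether it fixes the import into a particular editor is an assumption that would need checking; the temporary file path below is only for illustration.

```r
# Write the example plot without using the Dingbats font for pch = 1 circles.
out <- file.path(tempdir(), "points_no_dingbats.pdf")

pdf(file = out, useDingbats = FALSE)
plot(1:5)
dev.off()

file.exists(out)
```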
Re: [R] Loop
Hello, The best way to get help from people on the list is for you to give us *reproducible* examples of exactly what is you want. Usually, you can come up with some sample data and code that corresponds to your situation, and that we can run directly by cutting and pasting from the email. You can find more details about this in the posting guide, linked to at the bottom of every email. Best, --Erik Matevž Pavlič wrote: Hi all, I understand that you most of you this is a peice of cake but i am a complete newbie in thisso any example would be greatly aprpeciated and also any hint as how to get around in R. Frankly i sometimes see the help files kinda confusing. M -Original Message- From: Petr PIKAL [mailto:petr.pi...@precheza.cz] Sent: Thursday, November 04, 2010 3:40 PM To: Matevž Pavlič Cc: r-help@r-project.org Subject: Re: [R] Loop Hi r-help-boun...@r-project.org napsal dne 04.11.2010 14:21:38: Hi David, I am still having troubles with that loop ... This code gives me (kinda) the name of the column/field in a data frame. Filed names are form W1-W10. But there is a space between W and a number -- W 10, and column (field) names do not contain numbers. for(i in 1:10) { vari - paste(W,i) } vari [1] W 10 Now as i understand than i would call different columns to R with w-lit[[vari]] Or am i wrong again? Then I would probably need another loop to create the names of the variables on R, i.e. w1 to w10. Is that a general idea for the procedure? Beware of such loops. Instead of littering your workspace with files/objects constructed by some paste(whatever, i) solution you can save results in list or data.frame or matrix and simply use basic subsetting procedures or lapply/sapply functions. I must say I never used such paste(...) construction yet and I work with R for quite a long time. 
Regards Petr Thank for the help, m -Original Message- From: David Winsemius [mailto:dwinsem...@comcast.net] Sent: Wednesday, November 03, 2010 10:41 PM To: Matevž Pavlič Cc: r-help@r-project.org Subject: Re: [R] Loop On Nov 3, 2010, at 5:03 PM, Matevž Pavlič wrote: Hi, Thanks for the help and the manuals. Will come very handy i am sure. But regarding the code i don't hink this is what i wantbasically i would like to repeat bellow code : w1-table(lit$W1) w1-as.data.frame(w1) It appears you are not reading for meaning. Burns has advised you how to construct column names and use them in your initial steps. The `$` function is quite limited in comparison to `[[` , so he was showing you a method that would be more effective. BTW the as.data.frame step is unnecessary, since the first thing write.table does is coerce an object to a data.frame. The write.table name is misleading. It should be write.data.frame. You cannot really write tables with write.table. You would also use: file=paste(vari, csv, sep=.) as the file argument to write.table write.table(w1,file=w1.csv,sep=;,row.names=T, dec=.) What are these next actions supposed to do after the file is written? Are you trying to store a group of related w objects that will later be indexed in sequence? If so, then a list would make more sense. -- David. w1- w1[order(w1$Freq, decreasing=TRUE),] w1-head(w1, 20) 20 times, where W1-20 (capital letters) are the fields in a data.frame called lit and w1-20 are the data.frames being created. Hope that explains it better, m -Original Message- From: Patrick Burns [mailto:pbu...@pburns.seanet.com] Subject: Re: [R] Loop If I understand properly, you'll want something like: lit[[w2]] instead of lit$w2 more accurately: for(i in 1:20) { vari - paste(w, i) lit[[vari]] ... } The two documents mentioned in my signature may help you. On 03/11/2010 20:23, Matevž Pavlič wrote: Hi all, I managed to do what i want (with the great help of thi mailing list) manually . 
Now i would like to automate it. I would probably need a for loop for to help me with this...but of course I have no idea how to do that in R. Bellow is the code that i would like to be replicated for a number of times (let say 20). I would like to achieve that w1 would change to w2, w3, w4 ... up to w20 and by that create 20 data.frames that I would than bind together with cbind. (i did it like shown bellow -manually) w1-table(lit$W1) w1-as.data.frame(w1) write.table(w1,file=w1.csv,sep=;,row.names=T, dec=.) w1- w1[order(w1$Freq, decreasing=TRUE),] w1-head(w1, 20) w2-table(lit$W2) w2-as.data.frame(w2) write.table(w2,file=w2.csv,sep=;,row.names=T, dec=.) w2- w2[order(w2$Freq, decreasing=TRUE),] w2-head(w2, 20) . . . Thanks for the help,m David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
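The advice in this thread (build the column name as a string, index with `[[`, and keep the results in a list) can be sketched roughly as follows. The data frame below is a toy stand-in for the poster's lit, with three columns instead of twenty. The object names cols and top are invented for the example.

```r
# Toy stand-in for the poster's data frame 'lit' (only W1..W3 here)
set.seed(1)
lit <- data.frame(W1 = sample(c("sand", "clay", "silt"), 30, replace = TRUE),
                  W2 = sample(c("sand", "clay"), 30, replace = TRUE),
                  W3 = sample(c("gravel", "clay", "silt"), 30, replace = TRUE))

# sep = "" avoids the "W 1" problem that plain paste("W", i) produces
cols <- paste("W", 1:3, sep = "")

# One frequency table per column, sorted by count, top 20 rows, kept in a list
top <- lapply(cols, function(v) {
  tab <- as.data.frame(table(lit[[v]]))
  tab <- tab[order(tab$Freq, decreasing = TRUE), ]
  head(tab, 20)
})
names(top) <- cols

top[["W2"]]   # pull out one result by name
```

Keeping the results in one named list means there is no need to create w1, w2, ... as separate objects. If the CSV files are still wanted, a write.table() call with file = paste(v, "csv", sep = ".") could go inside the same function.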
Re: [R] Converting Strings to Variable names
Anand Bambhania wrote:
> Hi all,
>
> I am processing data from 24 samples and combining them into a single table called CombinedSamples using the following:
>
>   CombinedSamples <- rbind(Sample1, Sample2, Sample3)

Please use reproducible examples.

> Now variables Sample1, Sample2 and Sample3 have many different columns.

Then you can't 'rbind' them, correct? From ?rbind:

  If there are several matrix arguments, they must all have the same number of columns (or rows) and this will be the number of columns (or rows) of the result.

> To make it more flexible for other samples I'm replacing the above code with a for loop:
>
>   # Sample is a string vector containing all 24 sample names
>   for (k in 1:length(Sample)) {
>     CombinedSamples <- rbind(get(Sample[k]))
>   }
>
> This code only stores the last sample's data, as CombinedSamples gets overwritten every time. Using CombinedSamples[k] or CombinedSamples[k,] causes dimension-related errors, as each sample has several rows and not just 24. So how can I assign the data of all 24 samples to CombinedSamples?

I don't know, since I'm unsure of the structure of these objects. If they all have the same structure, I'd store them in a list and do:

  CombinedSamples <- do.call(rbind, sampleList)

Otherwise, perhaps use ?Reduce and ?merge. If you can provide a more complete example to the list, please do. You need not resort to a for loop/get hack for this.

Best,
--Erik

> Thanks,
> Anand
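A minimal sketch of the list-based approach, assuming (as Erik does) that all samples share the same columns. The three small data frames are invented stand-ins for the poster's Sample1..Sample24, and mget() is one way to collect existing objects into a list from a vector of names.

```r
# Toy stand-ins for the poster's sample objects (same columns assumed)
Sample1 <- data.frame(id = 1:2, value = c(0.1, 0.2))
Sample2 <- data.frame(id = 3:4, value = c(0.3, 0.4))
Sample3 <- data.frame(id = 5:6, value = c(0.5, 0.6))

Sample <- c("Sample1", "Sample2", "Sample3")   # object names held as strings

# Fetch all named objects into one list, then bind them in a single call
sampleList <- mget(Sample, envir = environment())
CombinedSamples <- do.call(rbind, sampleList)

dim(CombinedSamples)
```

If the samples can be read straight into a list in the first place (for example with lapply over file names), the mget() step is not needed at all.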
Re: [R] postForm() in RCurl and library RHTMLForms
I don't have an implementation that works the way you want it, sorry, but someone here will definitely know. The group showed me how to do it this way, though.

  library(zoo)
  library(RCurl)
  sNiftyURL = "http://nseindia.com/content/indices/histdata/SP%20CNX%20NIFTY01-01-2000-02-11-2010.csv"
  Nifty_Dat = getURLContent(sNiftyURL, verbose = TRUE, useragent = getOption("HTTPUserAgent"))
  tblNifty <- read.csv(textConnection(Nifty_Dat))
  tblNifty <- subset(tblNifty, select = c(Date, Close))
  tblNifty$Date <- as.Date(tblNifty$Date, format = "%d-%b-%Y")
  tblNifty <- read.zoo(tblNifty)
  closeAllConnections()

HTH.
S

From: sayan dasgupta [mailto:kitt...@gmail.com]
Sent: 04 November 2010 15:09
To: r-help@r-project.org
Cc: dun...@wald.ucdavis.edu; santosh.srini...@gmail.com
Subject: postForm() in RCurl and library RHTMLForms

Hi RUsers,

Suppose I want to see the data on the website

  url <- "http://www.nseindia.com/content/indices/ind_histvalues.htm"

for the index SP CNX NIFTY for the dates FromDate=01-11-2010, ToDate=02-11-2010, and then read the HTML table from the page using readHTMLTable(). I am using this code:

  webpage <- postForm(url,
                      .params = list(FromDate = "01-11-2010",
                                     ToDate = "02-11-2010",
                                     IndexType = "SP CNX NIFTY",
                                     Indicesdata = "Get Details"),
                      .opts = list(useragent = getOption("HTTPUserAgent")))

But it doesn't give me the desired result.

I was also trying to use the function getHTMLFormDescription from the package RHTMLForms, but there we can't use the argument .opts = list(useragent = getOption("HTTPUserAgent")), which is needed for this particular website.

Thanks and Regards
Sayan Dasgupta
Re: [R] Problems with points in plots when importing from pdf to an SVG editor
Hi Erik!

I googled and found that very helpful message at this address:
http://stat.ethz.ch/R-manual/R-devel/library/grDevices/html/pdf.html

But when I type ?pdf I get directed here:
http://127.0.0.1:29358/library/grDevices/html/pdf.html
which doesn't contain that information.

Thanks for the help

2010/11/4 Erik Iverson <er...@ccbr.umn.edu>

> Just read the help page :). This is under "Note" in ?pdf:
>
>   On some systems the default plotting character 'pch = 1' is displayed in some PDF viewers incorrectly as a 'q' character. (These seem to be viewers based on the 'poppler' PDF rendering library.) This may be due to incorrect or incomplete mapping of font names to those used by the system. Adding the following lines to '~/.fonts.conf' or '/etc/fonts/local.conf' may circumvent this problem.
>
>     <alias binding="same">
>       <family>ZapfDingbats</family>
>       <accept><family>Dingbats</family></accept>
>     </alias>
>
> I've found that in my case, this happens when viewing a PDF with that plotting character under old versions of Evince, but not newer ones.
>
> --Erik
>
> Rafael Björk wrote:
> > Dear R-users,
> >
> > When trying to import graphics from a PDF file into a vector graphics editor (I use Inkscape, but I've confirmed the same problem with Adobe products), all points in the graphics turn out as "q"s. This example displays the behaviour:
> >
> >   pdf(file="points are weird.pdf")
> >   plot(1:5)
> >   dev.off()
> >
> > When importing the file into Inkscape, I get five neatly arranged little "q"s. The obvious workaround would be to change the points to another plotting character, but this isn't the first time I've encountered this behaviour, and it would be nice to solve it properly instead.
> >
> > I realize this might not strictly be a question related to R, but if someone who has encountered the problem has found a solution or workaround, it would be greatly appreciated.
> >
> > Kind regards/
> > Rafael
Re: [R] Converting Strings to Variable names
Hi Anand,

Try creating a variable where you can store your data, and append to it in your loop. See the added lines of code below...

On Thu, Nov 4, 2010 at 9:43 AM, Anand Bambhania <amb1netwo...@gmail.com> wrote:
> Hi all,
>
> I am processing data from 24 samples and combining them into a single table called CombinedSamples using the following:
>
>   CombinedSamples <- rbind(Sample1, Sample2, Sample3)
>
> Now variables Sample1, Sample2 and Sample3 have many different columns. To make it more flexible for other samples I'm replacing the above code with a for loop:
>
>   # Sample is a string vector containing all 24 sample names

  # create a variable to stick your results in
  res <- NULL
  for (k in 1:length(Sample)) {
    CombinedSamples <- rbind(get(Sample[k]))
    res <- rbind(res, CombinedSamples)
  }

Now, every iteration of your loop should append CombinedSamples to res, and you won't overwrite your results every time.

HTH,
Mike

> This code only stores the last sample's data, as CombinedSamples gets overwritten every time. Using CombinedSamples[k] or CombinedSamples[k,] causes dimension-related errors, as each sample has several rows and not just 24. So how can I assign the data of all 24 samples to CombinedSamples?
>
> Thanks,
> Anand
Re: [R] Sorting data from one column with strings
(apologies for any double hits; forgot to reply all...)

Or, you could just go back to basics and write yourself a general loop that goes through the levels of a variable and gives you back whatever statistics you want. Below is an example where you estimate the mean for each level, but you could estimate any number of statistical parameters...

  dat <- data.frame(c(rep("A",5), rep("B",5), rep("C",5)), c(1:15))
  results <- NULL
  for (i in levels(factor(dat[,1]))) {
    sub.dat <- subset(dat, dat[,1] == i)
    res <- mean(sub.dat[,2])
    results <- c(results, i, res)
  }
  results.mat <- matrix(results, ncol=2, byrow=TRUE)
  results.mat

HTH,
Mike

On Thu, Nov 4, 2010 at 7:28 AM, Ramsvatn Silje <silje.ramsv...@uit.no> wrote:
> Hello,
>
> I have tried to find this out some other way, but having been unsuccessful I have to try this list. I assume this should be quite simple.
>
> I have a dataset with 4 columns, Sample_no, Species, Nitrogen, Carbon, in csv format. In the Species column I have many different species with a varying number of observations per species. E.g.
>
>   Sample_no Species Nitrogen Carbon
>   1         Cod     15.2     -19.0
>   2         Haddock 14.8     -20.2
>   3         Cod     15.6     -18.5
>   4         Cod     13.2     -20.1
>   5         Haddock 14.3     -18.8
>   Etc..
>
> And I want to calculate mean, standard deviation etc. per species for the observations Nitrogen and Carbon, and later do plots and stats with the different species. I will in the end have many species, so I need it to be automatic; I can't enter code for every species separately.
>
> Can anyone help me with this? Or if this is the wrong list to send this question to, where do I send it?
>
> Thank you very much in advance.
>
> Best regards
> Silje Ramsvatn
> PhD-candidate
> University of Tromsø
> Norway
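As an alternative to an explicit loop, base R's aggregate() can compute per-species summaries in one call. This is only a sketch using the five example rows from the question. The object names x, means and sds are chosen for the example.

```r
# The five example rows from the question
x <- data.frame(
  Sample_no = 1:5,
  Species   = c("Cod", "Haddock", "Cod", "Cod", "Haddock"),
  Nitrogen  = c(15.2, 14.8, 15.6, 13.2, 14.3),
  Carbon    = c(-19.0, -20.2, -18.5, -20.1, -18.8)
)

# One row per species; works unchanged however many species there are
means <- aggregate(cbind(Nitrogen, Carbon) ~ Species, data = x, FUN = mean)
sds   <- aggregate(cbind(Nitrogen, Carbon) ~ Species, data = x, FUN = sd)

means
sds
```

Note that the formula interface drops rows with missing values by default, which may or may not be what is wanted on real data.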
[R] how to work with long vectors
HI, Dear R community,

I have one data set like this. What I want to do is to calculate the cumulative coverage. The following code works for a small data set (#rows = 100), but when fed the whole data set, it is still running after 24 hours. Can someone give some suggestions for long vectors?

  id           reads
  Contig79:1       4
  Contig79:2       8
  Contig79:3      13
  Contig79:4      14
  Contig79:5      17
  Contig79:6      20
  Contig79:7      25
  Contig79:8      27
  Contig79:9      32
  Contig79:10     33
  Contig79:11     34

  matt <- read.table("/house/groupdirs/genetic_analysis/mjblow/ILLUMINA_ONLY_MICROBIAL_GENOME_ASSEMBLY/4083340/STANDARD_LIBRARY/GWZW.994.5.1129.trim_69.fastq.19621832.sub.sorted.bam.clone.depth", sep="\t", skip=0, header=F, fill=T)
  dim(matt)
  [1] 3384766       2

  matt_plot <- function(matt, outputfile) {
    names(matt) <- c("id", "reads")
    cover <- matt$reads

    # calculate the cumulative coverage
    cover_per <- function(data) {
      output <- numeric(0)
      for (i in data) {
        x <- (100 * sum(ifelse(data >= i, 1, 0)) / length(data))
        output <- c(output, x)
      }
      return(output)
    }

    result <- cover_per(cover)
  }

Thanks so much!

--
Sincerely,
Changbin
--
[R] Matrix Manipulation
Hi,

Is there a quick way to go from this matrix:

  > A
       [,1] [,2] [,3]
  [1,]    1    1    1
  [2,]    2    2    2
  [3,]    3    3    3
  [4,]    4    4    4
  [5,]    5   NA    5
  [6,]   NA   NA    6
  [7,]   NA   NA   NA

to this matrix:

  > B
       [,1] [,2] [,3]
  [1,]    1   NA   NA
  [2,]    2   NA    1
  [3,]    3    1    2
  [4,]    4    2    3
  [5,]    5    3    4
  [6,]   NA    4    5
  [7,]   NA   NA    6

without using a loop? For example, using a vector which describes how many NAs are required from the top of the matrix, so in this case it would be c(0,2,1).

Many thanks
Emma

--
View this message in context: http://r.789695.n4.nabble.com/Matrix-Manipulation-tp3027266p3027266.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] Sorting data from one column with strings
try sqldf:

  > x
    Sample_no Species Nitrogen Carbon
  1         1     Cod     15.2  -19.0
  2         2 Haddock     14.8  -20.2
  3         3     Cod     15.6  -18.5
  4         4     Cod     13.2  -20.1
  5         5 Haddock     14.3  -18.8
  > require(sqldf)
  > sqldf("select Species, avg(Nitrogen) Nitrogen, avg(Carbon) Carbon from x group by Species")
    Species Nitrogen Carbon
  1     Cod 14.66667  -19.2
  2 Haddock 14.55000  -19.5

On Thu, Nov 4, 2010 at 8:28 AM, Ramsvatn Silje <silje.ramsv...@uit.no> wrote:
> Hello,
>
> I have tried to find this out some other way, but having been unsuccessful I have to try this list. I assume this should be quite simple.
>
> I have a dataset with 4 columns, Sample_no, Species, Nitrogen, Carbon, in csv format. In the Species column I have many different species with a varying number of observations per species. E.g.
>
>   Sample_no Species Nitrogen Carbon
>   1         Cod     15.2     -19.0
>   2         Haddock 14.8     -20.2
>   3         Cod     15.6     -18.5
>   4         Cod     13.2     -20.1
>   5         Haddock 14.3     -18.8
>   Etc..
>
> And I want to calculate mean, standard deviation etc. per species for the observations Nitrogen and Carbon, and later do plots and stats with the different species. I will in the end have many species, so I need it to be automatic; I can't enter code for every species separately.
>
> Can anyone help me with this? Or if this is the wrong list to send this question to, where do I send it?
>
> Thank you very much in advance.
>
> Best regards
> Silje Ramsvatn
> PhD-candidate
> University of Tromsø
> Norway

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
Re: [R] Loop
Hi r-help-boun...@r-project.org napsal dne 04.11.2010 15:49:31: Hi all, I understand that you most of you this is a peice of cake but i am a complete newbie in thisso any example would be greatly aprpeciated and also any hint as how to get around in R. Frankly i sometimes see the help files kinda confusing. OK. Instead of w1-table(lit$W1) w1-as.data.frame(w1) write.table(w1,file=w1.csv,sep=;,row.names=T, dec=.) w1- w1[order(w1$Freq, decreasing=TRUE),] w1-head(w1, 20) Suppose you have data frame or matrix, and you want to have 5 most common values from each column # prepare matrix x-sample(1:20, 100, replace=T) mat-matrix(x, ncol=10) #apply user defined function for each column apply(mat, 2, function(x) head(sort(table(x), decreasing=T),5)) [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [1,] 5091 5135 5174 5133 5199 5097 5165 5157 5134 5068 [2,] 5073 5111 5143 5064 5113 5078 5102 5157 5131 5065 [3,] 5058 5092 5115 5051 5079 5064 5088 5128 5076 5063 [4,] 5056 5073 5114 5047 5059 5044 5037 5064 5071 5063 [5,] 5047 5064 5072 5041 5057 5041 5035 5058 5032 5061 If you want to do it in loop (can be quicker sometimes) and save it to list make a list lll-vector(list, 10) and fill it with your results for (i in 1:10) lll[[i]]-head(sort(table(mat[,i]), decreasing=T),5) and now you can call values from this lll list simply by lll[5] [[1]] 9 15 136 16 5199 5113 5079 5059 5057 lll[[5]] 9 15 136 16 5199 5113 5079 5059 5057 or even lll[[5]][3] 13 5079 without need for writing to individual files pasting together letters and numbers etc. There shall be R-intro document in your installation and it is worth reading. It is not so big, you can manage it in less then month if you complete more than 3 pages per day. 
Regards Petr M -Original Message- From: Petr PIKAL [mailto:petr.pi...@precheza.cz] Sent: Thursday, November 04, 2010 3:40 PM To: Matevž Pavlič Cc: r-help@r-project.org Subject: Re: [R] Loop Hi r-help-boun...@r-project.org napsal dne 04.11.2010 14:21:38: Hi David, I am still having troubles with that loop ... This code gives me (kinda) the name of the column/field in a data frame. Filed names are form W1-W10. But there is a space between W and a number -- W 10, and column (field) names do not contain numbers. for(i in 1:10) { vari - paste(W,i) } vari [1] W 10 Now as i understand than i would call different columns to R with w-lit[[vari]] Or am i wrong again? Then I would probably need another loop to create the names of the variables on R, i.e. w1 to w10. Is that a general idea for the procedure? Beware of such loops. Instead of littering your workspace with files/objects constructed by some paste(whatever, i) solution you can save results in list or data.frame or matrix and simply use basic subsetting procedures or lapply/ sapply functions. I must say I never used such paste(...) construction yet and I work with R for quite a long time. Regards Petr Thank for the help, m -Original Message- From: David Winsemius [mailto:dwinsem...@comcast.net] Sent: Wednesday, November 03, 2010 10:41 PM To: Matevž Pavlič Cc: r-help@r-project.org Subject: Re: [R] Loop On Nov 3, 2010, at 5:03 PM, Matevž Pavlič wrote: Hi, Thanks for the help and the manuals. Will come very handy i am sure. But regarding the code i don't hink this is what i wantbasically i would like to repeat bellow code : w1-table(lit$W1) w1-as.data.frame(w1) It appears you are not reading for meaning. Burns has advised you how to construct column names and use them in your initial steps. The `$` function is quite limited in comparison to `[[` , so he was showing you a method that would be more effective. 
BTW the as.data.frame step is unnecessary, since the first thing write.table does is coerce an object to a data.frame. The write.table name is misleading. It should be write.data.frame. You cannot really write tables with write.table. You would also use: file=paste(vari, csv, sep=.) as the file argument to write.table write.table(w1,file=w1.csv,sep=;,row.names=T, dec=.) What are these next actions supposed to do after the file is written? Are you trying to store a group of related w objects that will later be indexed in sequence? If so, then a list would make more sense. -- David. w1- w1[order(w1$Freq, decreasing=TRUE),] w1-head(w1, 20) 20 times, where W1-20 (capital letters) are the fields in a data.frame called lit and w1-20 are the data.frames being created. Hope that explains it better, m -Original Message- From: Patrick Burns [mailto:pbu...@pburns.seanet.com] Subject: Re: [R] Loop If I understand properly, you'll want something like:
[R] Plotting a grid of directly specified colours
Dear R-help,

Could any of you direct me to a function for plotting a grid of colours, directly specified by a matrix of hex colour codes? In other words, I'm looking for a heatmap()- or image()-like function to which I can specify the colour of each grid location directly, rather than providing a numerical matrix and a 1D colour scale (heatmap, image, levelplots, NeatMap...). I'm surprised I haven't found anything simple with RSiteSearch, help.search, or on the net.

I'd like to use this function to encode one variable as chroma and a second as luminance (hcl colour space), so that the two variables can be visualised in a single heatmap (the variables are fold-change and a q-value, a significance measure). If anyone has any thoughts/warnings to offer regarding this idea then I'd love to hear them (it must have been tried before, but I've not come across any examples).

Best wishes and thank you,

Peter Davenport
Re: [R] Matrix Manipulation
try this:

  > x
       V2 V3 V4
  [1,]  1  1  1
  [2,]  2  2  2
  [3,]  3  3  3
  [4,]  4  4  4
  [5,]  5 NA  5
  [6,] NA NA  6
  [7,] NA NA NA
  > offset <- c(0,2,1)
  > # add the control to the data and make two copies so we can offset
  > x.new <- rbind(offset, x, x)
  > result <- apply(x.new, 2, function(.col){
  +     .col[seq(nrow(x) - .col[1L] + 2L, length = nrow(x))]
  + })
  > result
   V2 V3 V4
    1 NA NA
    2 NA  1
    3  1  2
    4  2  3
    5  3  4
   NA  4  5
   NA NA  6

On Thu, Nov 4, 2010 at 11:47 AM, emj83 <stp08...@shef.ac.uk> wrote:
> Hi,
>
> Is there a quick way to go from this matrix:
>
>   > A
>        [,1] [,2] [,3]
>   [1,]    1    1    1
>   [2,]    2    2    2
>   [3,]    3    3    3
>   [4,]    4    4    4
>   [5,]    5   NA    5
>   [6,]   NA   NA    6
>   [7,]   NA   NA   NA
>
> to this matrix:
>
>   > B
>        [,1] [,2] [,3]
>   [1,]    1   NA   NA
>   [2,]    2   NA    1
>   [3,]    3    1    2
>   [4,]    4    2    3
>   [5,]    5    3    4
>   [6,]   NA    4    5
>   [7,]   NA   NA    6
>
> without using a loop? For example, using a vector which describes how many NAs are required from the top of the matrix, so in this case it would be c(0,2,1).
>
> Many thanks
> Emma
>
> --
> View this message in context: http://r.789695.n4.nabble.com/Matrix-Manipulation-tp3027266p3027266.html
> Sent from the R help mailing list archive at Nabble.com.

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
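The same shift can also be written by prepending the required number of NAs to each column and truncating to the original length. This is a sketch of a different formulation, not the code from the reply above. It assumes that whatever falls off the bottom of a column can be discarded (in the example only NAs fall off).

```r
# The matrix A and offset vector from the question
A <- cbind(c(1, 2, 3, 4, 5, NA, NA),
           c(1, 2, 3, 4, NA, NA, NA),
           c(1, 2, 3, 4, 5, 6, NA))
offset <- c(0, 2, 1)   # number of NAs to push in at the top of each column

# Shift column j down by offset[j]: prepend NAs, keep the first nrow(A) values
B <- sapply(seq_len(ncol(A)), function(j) {
  head(c(rep(NA, offset[j]), A[, j]), nrow(A))
})

B
```

sapply() still iterates over the columns internally, so this avoids writing a for loop rather than avoiding iteration altogether.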
Re: [R] Matrix Manipulation
Many thanks, it's worked a treat :-)

Emma

--
View this message in context: http://r.789695.n4.nabble.com/Matrix-Manipulation-tp3027266p3027307.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] how to work with long vectors
Is this what you want:

  > x
              id reads
  1   Contig79:1     4
  2   Contig79:2     8
  3   Contig79:3    13
  4   Contig79:4    14
  5   Contig79:5    17
  6   Contig79:6    20
  7   Contig79:7    25
  8   Contig79:8    27
  9   Contig79:9    32
  10 Contig79:10    33
  11 Contig79:11    34
  > x$percent <- x$reads / max(x$reads) * 100
  > x
              id reads   percent
  1   Contig79:1     4  11.76471
  2   Contig79:2     8  23.52941
  3   Contig79:3    13  38.23529
  4   Contig79:4    14  41.17647
  5   Contig79:5    17  50.00000
  6   Contig79:6    20  58.82353
  7   Contig79:7    25  73.52941
  8   Contig79:8    27  79.41176
  9   Contig79:9    32  94.11765
  10 Contig79:10    33  97.05882
  11 Contig79:11    34 100.00000

On Thu, Nov 4, 2010 at 11:46 AM, Changbin Du <changb...@gmail.com> wrote:
> HI, Dear R community,
>
> I have one data set like this. What I want to do is to calculate the cumulative coverage. The following code works for a small data set (#rows = 100), but when fed the whole data set, it is still running after 24 hours. Can someone give some suggestions for long vectors?
>
>   id           reads
>   Contig79:1       4
>   Contig79:2       8
>   Contig79:3      13
>   Contig79:4      14
>   Contig79:5      17
>   Contig79:6      20
>   Contig79:7      25
>   Contig79:8      27
>   Contig79:9      32
>   Contig79:10     33
>   Contig79:11     34
>
>   matt <- read.table("/house/groupdirs/genetic_analysis/mjblow/ILLUMINA_ONLY_MICROBIAL_GENOME_ASSEMBLY/4083340/STANDARD_LIBRARY/GWZW.994.5.1129.trim_69.fastq.19621832.sub.sorted.bam.clone.depth", sep="\t", skip=0, header=F, fill=T)
>   dim(matt)
>   [1] 3384766       2
>
>   matt_plot <- function(matt, outputfile) {
>     names(matt) <- c("id", "reads")
>     cover <- matt$reads
>
>     # calculate the cumulative coverage
>     cover_per <- function(data) {
>       output <- numeric(0)
>       for (i in data) {
>         x <- (100 * sum(ifelse(data >= i, 1, 0)) / length(data))
>         output <- c(output, x)
>       }
>       return(output)
>     }
>
>     result <- cover_per(cover)
>   }
>
> Thanks so much!
>
> --
> Sincerely,
> Changbin
> --

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
Re: [R] how to work with long vectors
Try this: rev(100 * cumsum(matt$reads 1) / length(matt$reads) ) On Thu, Nov 4, 2010 at 1:46 PM, Changbin Du changb...@gmail.com wrote: HI, Dear R community, I have one data set like this, What I want to do is to calculate the cumulative coverage. The following codes works for small data set (#rows = 100), but when feed the whole data set, it still running after 24 hours. Can someone give some suggestions for long vector? idreads Contig79:14 Contig79:28 Contig79:313 Contig79:414 Contig79:517 Contig79:620 Contig79:725 Contig79:827 Contig79:932 Contig79:1033 Contig79:1134 matt-read.table(/house/groupdirs/genetic_analysis/mjblow/ILLUMINA_ONLY_MICROBIAL_GENOME_ASSEMBLY/4083340/STANDARD_LIBRARY/GWZW.994.5.1129.trim_69.fastq.19621832.sub.sorted.bam.clone.depth, sep=\t, skip=0, header=F,fill=T) # dim(matt) [1] 3384766 2 matt_plot-function(matt, outputfile) { names(matt)-c(id,reads) cover-matt$reads #calculate the cumulative coverage. + cover_per-function (data) { + output-numeric(0) + for (i in data) { + x-(100*sum(ifelse(data = i, 1, 0))/length(data)) + output-c(output, x) + } + return(output) + } result-cover_per(cover) Thanks so much! -- Sincerely, Changbin -- [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40 S 49° 16' 22 O [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
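For reference, the quantity the original loop computes (for each position, the percentage of reads greater than or equal to the read count there) can also be obtained from ranks, without comparing every element against every other. The sketch below is one way to do that, not the code suggested in the reply above. The function name cover_per_fast is made up, and it assumes the reads vector has no missing values.

```r
# Percentage of values >= each element, computed from ranks.
# Assumes 'reads' contains no NAs.
cover_per_fast <- function(reads) {
  n <- length(reads)
  # rank(ties.method = "min") is 1 + the number of strictly smaller values,
  # so n - rank + 1 is the number of values >= each element
  100 * (n - rank(reads, ties.method = "min") + 1) / n
}

# The eleven example rows from the question
reads <- c(4, 8, 13, 14, 17, 20, 25, 27, 32, 33, 34)
cover_per_fast(reads)
```

The original loop does one full pass over the data per element and grows its output with c() on every iteration, so its cost rises roughly with the square of the number of rows. A rank-based version only needs a sort, which should remain practical for a vector of a few million rows.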
Re: [R] Plotting a grid of directly specified colours
Hi,

try this,

  library(grid)
  grid.raster(matrix(colors(), ncol=50), interp=F)

HTH,

baptiste

On Nov 4, 2010, at 5:00 PM, Peter Davenport wrote:
> Dear R-help,
>
> Could any of you direct me to a function for plotting a grid of colours, directly specified by a matrix of hex colour codes? In other words, I'm looking for a heatmap()- or image()-like function to which I can specify the colour of each grid location directly, rather than providing a numerical matrix and a 1D colour scale (heatmap, image, levelplots, NeatMap...). I'm surprised I haven't found anything simple with RSiteSearch, help.search, or on the net.
>
> I'd like to use this function to encode one variable as chroma and a second as luminance (hcl colour space), so that the two variables can be visualised in a single heatmap (the variables are fold-change and a q-value, a significance measure). If anyone has any thoughts/warnings to offer regarding this idea then I'd love to hear them (it must have been tried before, but I've not come across any examples).
>
> Best wishes and thank you,
>
> Peter Davenport
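To connect this with the chroma/luminance idea in the question, here is a rough sketch that builds the matrix of hex colours with hcl() and hands it to grid.raster(). The data are random and the scaling of fold change to chroma and q-value to luminance is an arbitrary choice made for illustration, not a recommendation.

```r
library(grid)

set.seed(42)
fold_change <- matrix(rnorm(60), nrow = 6)   # made-up fold changes
q_value     <- matrix(runif(60), nrow = 6)   # made-up q-values in [0, 1]

# Arbitrary mapping: |fold change| -> chroma (0..80), q-value -> luminance (30..90)
chroma <- 80 * abs(fold_change) / max(abs(fold_change))
lum    <- 30 + 60 * q_value

# hcl() is vectorised; rebuild the matrix shape from the resulting hex codes
cols <- matrix(hcl(h = 260, c = as.vector(chroma), l = as.vector(lum)),
               nrow = nrow(fold_change))

# Draw only in an interactive session, one cell per matrix entry
if (interactive()) {
  grid.newpage()
  grid.raster(cols, interpolate = FALSE)
}
```

One thing to watch is that hcl() by default adjusts combinations of chroma and luminance that fall outside the displayable range (see the fixup argument in ?hcl), so very dark or very light cells may show less chroma than requested.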
Re: [R] how to work with long vectors
Thanks, Jim!

This is not what I want. What I want is to calculate, for each position, the percentage of reads bigger than or equal to the reads at that position. My output is like the following: for row 1, all the reads are >= 4, so cover_per is 100; for row 2, 99% of the reads are >= 8, so cover_per is 99.

> head(final)
  cover_per reads
1       100     4
2        99     8
3        98    13
4        97    14
5        96    17
6        95    20

I attached the input file with this email. This file is only 100 rows, very small. My original data set is 3384766 rows.

matt <- read.table("/house/groupdirs/genetic_analysis/mjblow/ILLUMINA_ONLY_MICROBIAL_GENOME_ASSEMBLY/4083340/STANDARD_LIBRARY/GWZW.994.5.1129.trim_69.fastq.19621832.sub.sorted.bam.clone.depth", sep="\t", skip=0, header=F, fill=T)
> dim(matt)
[1] 3384766 2

Thanks so much for your time!

matt <- read.table("/home/cdu/operon/dimer5_0623/matt_test.txt", sep="\t", skip=0, header=F, fill=T)
names(matt) <- c("id", "reads")
> dim(matt)
[1] 100 2

cover <- matt$reads

# calculate the cumulative coverage.
cover_per <- function(data) {
  output <- numeric(0)
  for (i in data) {
    x <- (100 * sum(ifelse(data >= i, 1, 0)) / length(data))
    output <- c(output, x)
  }
  return(output)
}

result <- cover_per(cover)
> head(result)
[1] 100 99 98 97 96 95

final <- data.frame(result, cover)
names(final) <- c("cover_per", "reads")
> head(final)
  cover_per reads
1       100     4
2        99     8
3        98    13
4        97    14
5        96    17
6        95    20

On Thu, Nov 4, 2010 at 9:18 AM, jim holtman jholt...@gmail.com wrote:

> Is this what you want:
>
> > x
>             id reads
> 1   Contig79:1     4
> 2   Contig79:2     8
> 3   Contig79:3    13
> 4   Contig79:4    14
> 5   Contig79:5    17
> 6   Contig79:6    20
> 7   Contig79:7    25
> 8   Contig79:8    27
> 9   Contig79:9    32
> 10 Contig79:10    33
> 11 Contig79:11    34
> > x$percent <- x$reads / max(x$reads) * 100
> > x
>             id reads   percent
> 1   Contig79:1     4  11.76471
> 2   Contig79:2     8  23.52941
> 3   Contig79:3    13  38.23529
> 4   Contig79:4    14  41.17647
> 5   Contig79:5    17  50.00000
> 6   Contig79:6    20  58.82353
> 7   Contig79:7    25  73.52941
> 8   Contig79:8    27  79.41176
> 9   Contig79:9    32  94.11765
> 10 Contig79:10    33  97.05882
> 11 Contig79:11    34 100.00000
>
> On Thu, Nov 4, 2010 at 11:46 AM, Changbin Du changb...@gmail.com wrote:
> > [snip]
>
> --
> Jim Holtman
> Cincinnati, OH
> +1 513 646 9390
>
> What is the problem that you are trying to solve?
-- Sincerely, Changbin -- Changbin Du DOE Joint Genome Institute Bldg 400 Rm 457 2800 Mitchell Dr Walnut Creet, CA 94598 Phone: 925-927-2856 Contig79:1 4 Contig79:2 8 Contig79:3 13 Contig79:4 14 Contig79:5 17 Contig79:6 20 Contig79:7 25 Contig79:8 27 Contig79:9 32 Contig79:10 33 Contig79:11 34 Contig79:12 36 Contig79:13 39 Contig79:14 40 Contig79:15 44 Contig79:16 49 Contig79:17 55 Contig79:18 56 Contig79:19 59 Contig79:20 60 Contig79:21 62 Contig79:22 64 Contig79:23 64 Contig79:24 68 Contig79:25 68 Contig79:26 68 Contig79:27 70 Contig79:28 73 Contig79:29 76 Contig79:30 77 Contig79:31 78 Contig79:32 78 Contig79:33 79 Contig79:34 80 Contig79:35 80 Contig79:36 84 Contig79:37 87 Contig79:38 87 Contig79:39 88 Contig79:40 88 Contig79:41 89
Re: [R] Plotting a grid of directly specified colours
On Thu, Nov 4, 2010 at 4:00 PM, Peter Davenport pwdavenp...@gmail.com wrote:

> Dear R-help,
>
> Could any of you direct me to a function for plotting a grid of colours, directly specified by a matrix of hex colour codes? In other words I'm looking for a heatmap() or image()-like function to which I can specify the colour of each grid location directly, rather than providing a numerical matrix and a 1D-colour scale (heatmap, image, levelplots, NeatMap...).
>
> [snip]

I've kludged this kind of thing in the past. Create a matrix of 1:(nrow*ncol), and specify the col as your colour matrix. Example:

m = matrix(c("red","green","blue","yellow","orange","black"),2,3)
mc = matrix(1:(nrow(m)*ncol(m)),nrow(m),ncol(m))
image(mc,col=m)

Barry
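The index-matrix trick above can be wrapped in a small helper. The following is only a sketch under stated assumptions: plot_colour_grid is a hypothetical name (not an existing R function), the colour matrix has more than one cell, and the fixed hue of 260 and the chroma/luminance ranges are arbitrary choices for illustration.

```r
# Sketch: draw a grid whose cell colours are given directly by a character matrix.
# plot_colour_grid is a hypothetical helper, not part of base R or any package.
plot_colour_grid <- function(cols) {
  stopifnot(is.matrix(cols), is.character(cols), length(cols) > 1)
  nr <- nrow(cols)
  nc <- ncol(cols)
  # Index matrix: cell k (column-major) holds the value k, so value k maps to colour k.
  idx <- matrix(seq_along(cols), nr, nc)
  # image() draws z[i, j] at (x[i], y[j]); transpose and flip so that row 1 of
  # `cols` is drawn at the top, as when the matrix is printed.
  z <- t(idx)[, nr:1, drop = FALSE]
  image(x = seq_len(nc), y = seq_len(nr), z = z, col = as.vector(cols),
        axes = FALSE, xlab = "", ylab = "")
  invisible(z)
}

# Example: chroma varies across columns, luminance across rows (hcl colour space).
chroma <- seq(10, 90, length.out = 6)
lum    <- seq(30, 90, length.out = 4)
cols   <- outer(lum, chroma, function(l, c) hcl(h = 260, c = c, l = l))
z <- plot_colour_grid(cols)
```

In a real use the two sequences would be replaced by the scaled fold-change and q-value matrices; whether chroma and luminance remain distinguishable across the whole range is worth checking by eye.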
Re: [R] how to work with long vectors
Hi, Henrique,

Thanks for the great help! I compared the output from your code:

te <- rev(100 * cumsum(matt$reads > 1) / length(matt$reads))
> te
 [1] 100 99 98 97 96 95 94 93 92 91 90 89 88 87 86 85 84 83
[19] 82 81 80 79 78 77 76 75 74 73 72 71 70 69 68 67 66 65
[37] 64 63 62 61 60 59 58 57 56 55 54 53 52 51 50 49 48 47
[55] 46 45 44 43 42 41 40 39 38 37 36 35 34 33 32 31 30 29
[73] 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11
[91] 10 9 8 7 6 5 4 3 2 1

with the output from my code:

> result
 [1] 100 99 98 97 96 95 94 93 92 91 90 89 88 87 86 85 84 83
[19] 82 81 80 79 79 77 77 77 74 73 72 71 70 70 68 67 67 65
[37] 64 64 62 62 60 59 58 57 56 56 54 53 52 51 51 49 48 47
[55] 46 45 45 43 42 41 40 39 38 37 36 35 34 33 32 31 30 29
[73] 28 27 27 27 24 24 22 21 20 19 19 19 19 15 14 14 12 11
[91] 10 9 8 7 7 5 4 3 2 1

There are no ties in your output, but there are ties in the data set (see below). Your code works fast, but I think the results are not accurate. Thanks so much for the great help!

> matt[c(1:35), ]
            id reads
1   Contig79:1     4
2   Contig79:2     8
;
;
22 Contig79:22    64
23 Contig79:23    64
24 Contig79:24    68
25 Contig79:25    68
26 Contig79:26    68

I also attached the testing file with this email.

Thanks!

On Thu, Nov 4, 2010 at 9:12 AM, Henrique Dallazuanna www...@gmail.com wrote:

> Try this:
>
> rev(100 * cumsum(matt$reads > 1) / length(matt$reads))
>
> On Thu, Nov 4, 2010 at 1:46 PM, Changbin Du changb...@gmail.com wrote:
> > [snip]
-- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40 S 49° 16' 22 O -- Sincerely, Changbin -- Changbin Du DOE Joint Genome Institute Bldg 400 Rm 457 2800 Mitchell Dr Walnut Creet, CA 94598 Phone: 925-927-2856 Contig79:1 4 Contig79:2 8 Contig79:3 13 Contig79:4 14 Contig79:5 17 Contig79:6 20 Contig79:7 25 Contig79:8 27 Contig79:9 32 Contig79:10 33 Contig79:11 34 Contig79:12 36 Contig79:13 39 Contig79:14 40 Contig79:15 44 Contig79:16 49 Contig79:17 55 Contig79:18 56 Contig79:19 59 Contig79:20 60 Contig79:21 62 Contig79:22 64 Contig79:23 64 Contig79:24 68 Contig79:25 68 Contig79:26 68 Contig79:27 70 Contig79:28 73 Contig79:29 76 Contig79:30 77 Contig79:31 78 Contig79:32 78 Contig79:33 79 Contig79:34 80 Contig79:35 80 Contig79:36 84 Contig79:37 87 Contig79:38 87 Contig79:39 88 Contig79:40 88 Contig79:41 89 Contig79:42 93 Contig79:43 94 Contig79:44 98 Contig79:45 99 Contig79:46 99 Contig79:47 102 Contig79:48 103 Contig79:49 108 Contig79:50 112 Contig79:51 112 Contig79:52 113 Contig79:53 116 Contig79:54 118 Contig79:55 120 Contig79:56 124 Contig79:57 124 Contig79:58 126 Contig79:59 128 Contig79:60 130 Contig79:61 133 Contig79:62 134 Contig79:63 136 Contig79:64 139 Contig79:65 144 Contig79:66 145 Contig79:67 146 Contig79:68 148 Contig79:69 149 Contig79:70 151 Contig79:71 156 Contig79:72 157 Contig79:73 158 Contig79:74 159 Contig79:75 159 Contig79:76 159 Contig79:77 160 Contig79:78 160 Contig79:79 161 Contig79:80 163
Re: [R] how to work with long vectors
On 11/04/2010 09:45 AM, Changbin Du wrote:

> Thanks, Jim!
>
> This is not what I want. What I want is to calculate the percentage of reads bigger than or equal to the reads at each position. My output is like the following:

Hi Changbin -- I might be repeating myself, but the Bioconductor packages IRanges and GenomicRanges are designed to work with this sort of data, and include 'coverage' functions that do what you're interested in. Look into ?GRanges if interested.

http://bioconductor.org/help/bioc-views/release/BiocViews.html#___HighThroughputSequencing

Martin

> [snip]

--
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109
Location: M1-B861
Telephone: 206 667-2793
Re: [R] how to work with long vectors
Thanks Martin, I will try this.

On Thu, Nov 4, 2010 at 10:06 AM, Martin Morgan mtmor...@fhcrc.org wrote:

> Hi Changbin -- I might be repeating myself, but the Bioconductor packages IRanges and GenomicRanges are designed to work with this sort of data, and include 'coverage' functions that do what you're interested in. Look into ?GRanges if interested.
>
> http://bioconductor.org/help/bioc-views/release/BiocViews.html#___HighThroughputSequencing
>
> Martin
>
> [snip]
Re: [R] how to work with long vectors
Changbin -
   Does

100 * sapply(matt$reads, function(x) sum(matt$reads >= x)) / length(matt$reads)

give what you want?

By the way, if you want to use a loop (there's nothing wrong with that), then try to avoid the most common mistake that people make with loops in R: having your result grow inside the loop. Here's a better way to use a loop to solve your problem:

cover_per_1 <- function(data){
   l = length(data)
   output = numeric(l)
   for(i in 1:l) output[i] = 100 * sum(ifelse(data >= data[i], 1, 0))/length(data)
   output
}

Using some random data, and comparing to your original cover_per function:

> dat = rnorm(1000)
> system.time(one <- cover_per(dat))
   user  system elapsed
  0.816   0.000   0.824
> system.time(two <- cover_per_1(dat))
   user  system elapsed
  0.792   0.000   0.805

Not that big a speedup, but it does increase quite a bit as the problem gets larger.

There are two obvious ways to speed up your function:
   1) Eliminate the ifelse function, since automatic coercion from logical to numeric does the same thing.
   2) Multiply by 100 and divide by the length outside the loop:

cover_per_2 <- function(data){
   l = length(data)
   output = numeric(l)
   for(i in 1:l) output[i] = sum(data >= data[i])
   100 * output / l
}

> system.time(three <- cover_per_2(dat))
   user  system elapsed
  0.024   0.000   0.027

That makes the loop just about equivalent to the sapply solution:

> system.time(four <- 100*sapply(dat, function(x) sum(dat >= x))/length(dat))
   user  system elapsed
  0.024   0.000   0.026

                                        - Phil Spector
                                          Statistical Computing Facility
                                          Department of Statistics
                                          UC Berkeley
                                          spec...@stat.berkeley.edu

On Thu, 4 Nov 2010, Changbin Du wrote:

> [snip]
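Both the loop and the sapply version compare every element with every other element, so their cost grows quadratically with the vector length, which matters for 3384766 rows. A rank-based alternative is sketched below; it was not proposed in the thread, the name cover_per_rank is hypothetical, and it assumes the reads vector contains no missing values.

```r
# Percentage of values >= each element, computed from ranks rather than by
# comparing every pair of elements. cover_per_rank is a hypothetical name.
cover_per_rank <- function(data) {
  n <- length(data)
  # rank(..., ties.method = "min") equals 1 + the number of strictly smaller
  # values, so n - rank + 1 counts the values greater than or equal to each one.
  100 * (n - rank(data, ties.method = "min") + 1) / n
}

# Reference version from the thread, used here only as a cross-check.
cover_per_sapply <- function(data) {
  100 * sapply(data, function(x) sum(data >= x)) / length(data)
}

# Small made-up vector with ties, in the spirit of the attached test file.
reads <- c(4, 8, 13, 14, 17, 20, 64, 64, 68, 68, 68)
fast <- cover_per_rank(reads)
slow <- cover_per_sapply(reads)
stopifnot(isTRUE(all.equal(fast, slow)))
```

Because tied reads share the same minimum rank, they receive the same percentage, which is the behaviour of the original cover_per function.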
Re: [R] how to work with long vectors
Thanks Phil, that is great! I will try this and let you know how it goes.

On Thu, Nov 4, 2010 at 10:16 AM, Phil Spector spec...@stat.berkeley.edu wrote:

> Changbin -
>    Does
>
> 100 * sapply(matt$reads, function(x) sum(matt$reads >= x)) / length(matt$reads)
>
> give what you want?
>
> [snip]

--
Sincerely,
Changbin
--
Changbin Du
DOE Joint Genome Institute
Bldg 400 Rm 457
2800 Mitchell Dr
Walnut Creek, CA 94598
Phone: 925-927-2856
[R] candisc plot subset of all groups
Hello,

I'm doing CDA (canonical discriminant analysis), and want to view a plot one group at a time. How can I plot just one group?

Example:

library(candisc)
iris.mod <- lm(cbind(Petal.Length, Sepal.Length, Petal.Width, Sepal.Width) ~ Species, data=iris)
iris.can <- candisc(iris.mod, data=iris)
plot(iris.can)

In this example the plot shows all three groups - how do I get it to only show one group?

Thanks!

--
View this message in context: http://r.789695.n4.nabble.com/candisc-plot-subset-of-all-groups-tp3027532p3027532.html
Sent from the R help mailing list archive at Nabble.com.
[R] ANOVA table and lmer
The following output results from fitting models using lmer and lm to data arising from a split-plot experiment (#320 from Small Data Sets by Hand et al. 1994). The data is given at the bottom of this message. My question is why the sum of squares for variety (V) is different in the ANOVA table generated from the lmer model fit from that generated by the lm model fit. The decomposition of the sum of squares should be the same regardless of whether block is treated as random or fixed. Or am I misinterpreting the ANOVA table from the lmer fit? I noticed that other people have asked similar questions in the past, but I haven't seen a satisfactory explanation.

Jim Booth.

> B=factor(block)
> V=factor(variety)
> N=factor(nitrogen)
> Y=yield
> lmm.split=lmer(Y~V+N+V:N+(1|B)+(1|B:V)+(1|B:N))
> anova(lmm.split)
Analysis of Variance Table
    Df  Sum Sq Mean Sq F value
V    2   526.1   263.0  1.4853
N    3 20020.5  6673.5 37.6856
V:N  6   321.8    53.6  0.3028

> lm.split=lm(Y~B*V+B*N+V*N)
> anova(lm.split)
Analysis of Variance Table

Response: Y
          Df  Sum Sq Mean Sq F value    Pr(>F)
B          5 15875.3  3175.1 15.4114 1.609e-07 ***
V          2  1786.4   893.2  4.3354   0.02219 *
N          3 20020.5  6673.5 32.3926 1.540e-09 ***
B:V       10  6013.3   601.3  2.9188   0.01123 *
B:N       15  1788.2   119.2  0.5786   0.86816
V:N        6   321.7    53.6  0.2603   0.95103
Residuals 30  6180.6   206.0

> split
   block variety nitrogen yield
1      1       1        0   111
2      1       1        1   130
3      1       1        2   157
4      1       1        4   174
5      1       2        0   117
6      1       2        1   114
7      1       2        2   161
8      1       2        4   141
9      1       3        0   105
10     1       3        1   140
11     1       3        2   118
12     1       3        4   156
13     2       1        0    61
14     2       1        1    91
15     2       1        2    97
16     2       1        4   100
17     2       2        0    70
18     2       2        1   108
19     2       2        2   126
20     2       2        4   149
21     2       3        0    96
22     2       3        1   124
23     2       3        2   121
24     2       3        4   144
25     3       1        0    68
26     3       1        1    64
27     3       1        2   112
28     3       1        4    86
29     3       2        0    60
30     3       2        1   102
31     3       2        2    89
32     3       2        4    96
33     3       3        0    89
34     3       3        1   129
35     3       3        2   132
36     3       3        4   124
37     4       1        0    74
38     4       1        1    89
39     4       1        2    81
40     4       1        4   122
41     4       2        0    64
42     4       2        1   103
43     4       2        2   132
44     4       2        4   133
45     4       3        0    70
46     4       3        1    89
47     4       3        2   104
48     4       3        4   117
49     5       1        0    62
50     5       1        1    90
51     5       1        2   100
52     5       1        4   116
53     5       2        0    80
54     5       2        1    82
55     5       2        2    94
56     5       2        4   126
57     5       3        0    63
58     5       3        1    70
59     5       3        2   109
60     5       3        4    99
61     6       1        0    53
62     6       1        1    74
63     6       1        2   118
64     6       1        4   113
65     6       2        0    89
66     6       2        1    82
67     6       2        2    86
68     6       2        4   104
69     6       3        0    97
70     6       3        1    99
71     6       3        2   119
72     6       3        4   121
[R] map on irregular grids
Hi all,

how can I find a function for plotting a polygon surface, like

polgon3d(xc, yc, obs)

xc, yc ... coordinates
obs ... observations

Result: a persp plot with a grid net over the coordinates.

W. Polasek
[R] matlab code into R
Hello,

I'm trying to write the following matlab code into R:

N = zeros(n-1);
for i=2:(n-1)
    N(1,i) = 1/(pi * (i-1));
end
for i=2:(n-2)
    for j=i:(n-1)
        N(i,j) = N(i-1,j-1);
    end;
end
for i=2:(n-1)
end
for j=1:i
    N(i,j) = -N(j,i);
end;

Any suggestions?

Thanks

Can I just add the following line to my calculation: N=1/(pi*(i-1)

--
Marcelo Andrade de Lima
UNIFESP - Universidade Federal de São Paulo
Departamento de Bioquímica
Disciplina de Biologia Molecular
Rua Três de Maio 100, 4 andar - Vila Clementino, 04044-020
Lab +55 11 55764438 R.1188
Cell +55 11 92725274
ml...@unifesp.br
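One possible R translation is sketched below. It rests on an assumption: that the last two MATLAB loops were meant to be nested (for i = 2:(n-1), then for j = 1:i), so that the lower triangle is filled with the negated upper triangle. The function names build_N_loop and build_N_outer are hypothetical, and n is assumed to be at least 2.

```r
# Loop-for-loop translation. Assumption: the misplaced `end` in the posted code
# was meant to close a nested loop, so rows 2..(n-1) of the lower triangle get
# -N[j, i].
build_N_loop <- function(n) {
  m <- n - 1
  N <- matrix(0, m, m)
  # First row: 1 / (pi * (i - 1)) for columns 2..m.
  for (i in seq_len(m)[-1]) N[1, i] <- 1 / (pi * (i - 1))
  # Copy each upper-triangle entry from its upper-left neighbour.
  # seq_len(...)[-1] is empty when the range is, unlike 2:(m - 1).
  for (i in seq_len(m - 1)[-1]) {
    for (j in i:m) N[i, j] <- N[i - 1, j - 1]
  }
  # Mirror with a sign change into the lower triangle.
  for (i in seq_len(m)[-1]) {
    for (j in 1:i) N[i, j] <- -N[j, i]
  }
  N
}

# Vectorised equivalent: entry (i, j) is 1 / (pi * (j - i)), with a zero diagonal.
build_N_outer <- function(n) {
  m <- n - 1
  d <- outer(seq_len(m), seq_len(m), function(i, j) j - i)
  N <- 1 / (pi * d)   # Inf on the diagonal, overwritten on the next line
  diag(N) <- 0
  N
}

N_loop  <- build_N_loop(6)
N_outer <- build_N_outer(6)
stopifnot(isTRUE(all.equal(N_loop, N_outer)))
```

Under that reading the matrix is antisymmetric with constant diagonals, so the single expression 1/(pi*(j-i)) covers every off-diagonal entry and the loops are not strictly needed.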
[R] Plotting a vector data
Hi;

I have 30 data sets and I managed to take the average of a variable in each set and put them in a vector-like variable (it contains NaN data as well).

x <- matrix(list.files("C:/updated_CFL_Rad_files/2007/11", full=TRUE))
for(i in 1:30)
{
radiation.data <- read.table(x[i], header = TRUE, sep = ",", quote = "", dec = ".")
attach(radiation.data)
names(radiation.data)
mean.radiation[i] <- mean(PAR_avg, na.rm = TRUE)
}

How can I plot this vector (mean.radiation[i]) vs i? I tried to do so but there was an error:

Error in plot.window(...) : need finite 'ylim' values
In addition: Warning messages:
1: In min(x) : no non-missing arguments to min; returning Inf
2: In max(x) : no non-missing arguments to max; returning -Inf

--
Sincerely
Nasrin Pak
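The "need finite 'ylim' values" error typically means that plot() received no finite y values at all, for example a single NaN. A minimal sketch of one way to build and plot the vector follows; the data are simulated stand-ins for the 30 files (the real files and the PAR_avg column are not available here), so the numbers are made up for illustration.

```r
# Simulated stand-in for the 30 per-file means; values are invented.
set.seed(1)
mean.radiation <- rep(NA_real_, 30)   # create the vector before the loop
for (i in 1:30) {
  # Every 7th "file" contains only missing values, so its mean is NaN.
  par_avg <- if (i %% 7 == 0) rep(NA_real_, 10) else rnorm(10, mean = 200, sd = 20)
  mean.radiation[i] <- mean(par_avg, na.rm = TRUE)
}

# Plot the whole vector against its index, keeping only the finite entries.
ok <- is.finite(mean.radiation)
if (any(ok)) {
  plot(which(ok), mean.radiation[ok], type = "b",
       xlab = "file index i", ylab = "mean PAR_avg")
} else {
  message("no finite means to plot; check the PAR_avg column in the input files")
}
```

With the real data, read.table(x[i], ...) would replace the simulated par_avg, and using radiation.data$PAR_avg instead of attach() avoids picking up a stale PAR_avg from an earlier file.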
Re: [R] ggplot output
Hi:

This isn't very difficult if you use a little imagination. We want three separate plots of monthly means by variable with attached error bars. This requires faceting, so we need to create a factor whose levels are the variable names. We also need to generate enough data to summarize by mean and standard deviation per month in order to generate the point/line/error bar plots. Here's one attempt.

# Generate fake data: if you don't have an example, make one up! :)
# 20 obs. per month, simulate from normal distribution with a
# given vector of monthly means and standard deviations

## Set up time variable, monthly means and SDs ##
# I'm lazy: I just used the three letter month abbreviations in month.abb
# to use as my 'time' variable
Month <- factor(rep(month.abb, each = 20), levels = month.abb)
# monthly means
mmeans <- c(12, 15, 18, 20, 24, 30, 26, 23, 18, 15, 12, 10)
# monthly sd's
msd <- rep(c(3, 5, 4, 3), each = 3)

## Generate data ##
# Simulate 240 observations for each of the three variables
C <- rnorm(240, m = rep(mmeans, each = 20), s = rep(msd, each = 20))
K <- rnorm(240, m = rep(mmeans+2, each = 20), s = rep(msd+1, each = 20))
S <- rnorm(240, m = rep(mmeans-2, each = 20), s = rep(msd+2, each = 20))

# Combine into a data frame
d <- data.frame(Month, C, K, S)

library(ggplot2)   # plyr and reshape get loaded with ggplot2

## Process data: get monthly means/SDs for each variable ##
# Use ddply to get monthly means and sd's for each variable
# (Yes, there are more efficient ways to do this, but there are only three...)
md <- ddply(d, .(Month), summarise,
            C_avg = mean(C), C_stdev = sd(C),
            K_avg = mean(K), K_stdev = sd(K),
            S_avg = mean(S), S_stdev = sd(S))

# Melt the data from 'wide' to 'long' - the idea is to stack C, K and S values
# and use their names as a factor variable. This is a very useful trick for
# faceting or grouping.
# grep() is used to select variable names that end in 'avg' or 'stdev'; the $
# sign in a regular expression anchors the match to the end of the name.
# Thanks to Kohske Takahashi for the clue to melting multiple groups of variables
# in a post on the ggplot2 list.
dmelt <- data.frame(
    melt(md, id = 'Month', measure = c(grep('avg$', names(md)))),
    sd = melt(md, id = 'Month', measure = c(grep('stdev$', names(md))))$value )

# Some housecleaning: change value to Mean in names and create a new variable that
# only uses the variable name (C, K, S) as a factor level, to be used for labeling
# the facets. This reduces the amount of ggplot() code we need to write.
names(dmelt)[3] <- 'Mean'
dmelt$Variable <- substring(dmelt$variable, 1, 1)

# Should be straightforward - scale code is used to avoid overlapping labels in a
# confined graphics space.
g <- ggplot(dmelt, aes(x = Month, y = Mean))
g + geom_point() + geom_line(aes(group = 1)) +
    geom_errorbar(aes(ymin = Mean - sd, ymax = Mean + sd), width = 0.4) +
    facet_wrap(~ Variable, nrow = 1) +
    scale_x_discrete(breaks = levels(Month), labels = substring(month.abb, 1, 1))

# Just for the heck of it, here's a monthly plot of each variable's means
# (no error bars) as an alternative:
g + geom_point(aes(colour = Variable), size = 3) +
    geom_line(aes(colour = Variable, group = Variable), size = 1)

Notice that by limiting the aesthetics in g, I was able to insert the additional aesthetics ymin and ymax into geom_errorbar() in the first plot and colour + group in the second plot and still use the same g as a foundation.

HTH,
Dennis

On Thu, Nov 4, 2010 at 7:32 AM, ashz a...@walla.co.il wrote:

> Dear Thierry,
>
> Your solution looks very elegant but I cannot find a proper example. Can you provide me one?
>
> Thx
> --
> View this message in context: http://r.789695.n4.nabble.com/ggplot-output-tp3027026p3027108.html
> Sent from the R help mailing list archive at Nabble.com.
[R] Echo to file using Rscript
I use R 2.12.0 on Windows XP. For debugging and control I am trying to get a file which contains the echo produced when code is copied into the R GUI. This works, e.g., with the command:

R CMD BATCH --no-restore D:\path\script.r

A file called script.Rout is then generated in the same folder. It contains the code and the corresponding output. This does not work using:

Rscript --no-restore D:\Ketzin\Invers\V1\read_obs.r

A partial alternative is to include at the beginning of the code

sink("log.dat", type = c("output", "message"))

but the file does not contain the code, only the output. Is there a way to generate anything analogous to the *.Rout file, if possible from the command line only, without the sink() command?

Best,
Bernd
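One possible workaround, sketched here under the assumption that running the script through source() is acceptable: source(echo = TRUE) prints each expression before evaluating it, which is close to what R CMD BATCH writes to the .Rout file. The two-line script and the file names below are hypothetical stand-ins, not the poster's files.

```r
# Hypothetical stand-in for the real script: two lines written to a temp file.
script <- tempfile(fileext = ".r")
writeLines(c("x <- 1 + 1", "print(x)"), script)

# source(echo = TRUE) echoes each expression before evaluating it;
# capture.output() collects what would otherwise go to the console.
transcript <- capture.output(source(script, echo = TRUE))

# Write the transcript next to the script, mimicking script.Rout.
rout <- sub("\\.r$", ".Rout", script)
writeLines(transcript, rout)
```

From the command line the same idea might look like Rscript -e "source('script.r', echo = TRUE)" > script.Rout, again with a hypothetical file name. Long expressions are truncated in the echo by default; source() has a max.deparse.length argument to raise that limit.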
Re: [R] matlab code into R
Well, there's the obvious:

N = matrix(0, n-1, n-1)
for(i in 2:(n-1)) N[1,i] = 1/(pi * (i-1))
for(i in 2:(n-2))
   for(j in i:(n-1)) N[i,j] = N[i-1,j-1]
for(i in 2:(n-1))
   for(j in 1:i) N[i,j] = -N[j,i]

- Phil Spector
Statistical Computing Facility
Department of Statistics
UC Berkeley
spec...@stat.berkeley.edu

Hello, I'm trying to write the following matlab code into R:

N = zeros(n-1);
for i=2:(n-1)
   N(1,i) = 1/(pi * (i-1));
end
for i=2:(n-2)
   for j=i:(n-1)
      N(i,j) = N(i-1,j-1);
   end;
end
for i=2:(n-1)
   for j=1:i
      N(i,j) = -N(j,i);
   end;
end

Any suggestions? Thanks.

Can I just add the following line to my calculation? N = 1/(pi*(i-1))

--
Marcelo Andrade de Lima
UNIFESP - Universidade Federal de São Paulo
Departamento de Bioquímica
Disciplina de Biologia Molecular
Rua Três de Maio 100, 4 andar - Vila Clementino, 04044-020
Lab +55 11 55764438 R.1188
Cell +55 11 92725274
ml...@unifesp.br
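As a side note, the three loops fill a matrix whose entry (i, j) depends only on the offset j - i, so a loop-free construction is possible. The sketch below is only an illustration, assuming n >= 4 so that every loop range is ascending; the value n = 6 and the names N_loop, N_vec and d are made up for the example.

```r
n <- 6          # hypothetical size; the loops assume n >= 4
m <- n - 1

# Loop version, following the translation above.
N_loop <- matrix(0, m, m)
for (i in 2:m) N_loop[1, i] <- 1 / (pi * (i - 1))
for (i in 2:(n - 2)) for (j in i:m) N_loop[i, j] <- N_loop[i - 1, j - 1]
for (i in 2:m) for (j in 1:i) N_loop[i, j] <- -N_loop[j, i]

# Vectorised version: d[i, j] = j - i, zero on the diagonal.
d <- outer(seq_len(m), seq_len(m), function(i, j) j - i)
N_vec <- ifelse(d == 0, 0, 1 / (pi * d))
```

Both versions give an antisymmetric matrix with 1/(pi * (j - i)) off the diagonal; for large n the outer() form avoids the nested loops.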
[R] plotting time series for particular months!
Hi all,

I have a matrix as given below...

   year month day prec
1  1980    10   1 13.4
2  1980    10   2  0.0
3  1980    10   3  0.0
4  1980    10   4  0.0
5  1980    10   5  0.0
6  1980    10   6  0.0
7  1980    10   7  0.0
8  1980    10   8  0.0
9  1980    10   9  0.0
10 1980    10  10  0.0
11 1980    10  11  7.4
12 1980    10  12  5.4
13 1980    10  13  7.2
14 1980    10  14  0.0
15 1980    10  15  0.0
16 1980    10  16  0.0
17 1980    10  17 41.2
18 1980    10  18  0.0
19 1980    10  19  0.0
20 1980    10  20  0.0
21 1980    10  21  0.0
22 1980    10  22  0.0
23 1980    10  23  0.0
24 1980    10  24  0.0
25 1980    10  25  0.0
26 1980    10  26  0.0
27 1980    10  27  2.0
28 1980    10  28  0.0
29 1980    10  29  0.0
30 1980    10  30  0.0
31 1980    10  31  0.0
32 1980    11   1  0.0
33 1980    11   2  0.0
34 1980    11   3  0.0
35 1980    11   4  0.0
36 1980    11   5 12.4

The precipitation values extend from 1980 to 2005, but only for October, November and December. I would like to plot just these 3 months for the given time period (1980 - 2005). Is there a way to get these values on the x-axis? I.e., I need a plot with its axis reading 1980, 1981, 1982 ... 2005 (if the months are shown as well, it would be even more useful). For now, I get an axis like 0, 500, ... 2000 (the plot is showing the index values). I tried changing the freq as given below, but it did not work!

ts.chn <- ts(chn.arr[1:2386,4], start=c(1980, 10), end=c(2005, 12), freq=365)
plot(ts.chn)

--
Regards,
Mahalakshmi
Graduate Student
#20, Department of Geography
Michigan State University
East Lansing, MI 48824
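Since the series covers only three months of each year, a regular ts() frequency does not match the data. One way to get calendar years on the axis, sketched here with a small made-up data frame (chn, its values and the helper plot_prec are hypothetical), is to build real dates from the year, month and day columns and plot against those.

```r
# Hypothetical stand-in for a few rows of chn.arr (year, month, day, prec).
chn <- data.frame(
  year  = c(1980, 1980, 1980, 1981, 1981),
  month = c(10, 11, 12, 10, 11),
  day   = c(1, 5, 31, 17, 2),
  prec  = c(13.4, 12.4, 0.0, 41.2, 7.4)
)

# Build calendar dates so the x-axis is in years rather than row indices.
chn$date <- as.Date(ISOdate(chn$year, chn$month, chn$day))

# Plot precipitation against the dates; base graphics labels a Date axis
# in calendar units.
plot_prec <- function(d) {
  plot(d$date, d$prec, type = "h", xlab = "Year", ylab = "Precipitation")
}
if (interactive()) plot_prec(chn)
```

With the full 1980 to 2005 data the gaps between December and the following October simply show up as empty stretches on the date axis.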
Re: [R] How to do bootstrap for the complex sample design?
On Fri, Nov 5, 2010 at 3:51 AM, Tim Hesterberg timhesterb...@gmail.com wrote:

Faye wrote:
Our survey is structured as follows: the area to be investigated is divided into 6 regions; within each region, one urban community and one rural community are randomly selected, then samples are randomly drawn from each selected urban and rural community. The problem is that in each urban/rural stratum, we only have one sample. In this case, how to do bootstrap?

You are lucky that your sample size is 1. If it were 2 you would probably have proceeded without realizing that the answers were wrong.

Suppose you had two samples in each stratum. If you proceed naturally, drawing bootstrap samples of size 2 from each stratum, this would underestimate variability by a factor of 2.

In general the ordinary nonparametric bootstrap estimates of variability are biased downward by a factor of (n-1)/n -- exactly for the mean, approximately for other statistics. In multiple-sample and stratified situations, the bias depends on the stratum sizes.

Three remedies are:
* draw bootstrap samples of size n-1
* bootknife sampling - omit one observation (a jackknife sample), then draw a bootstrap sample of size n from that
* bootstrap from a kernel density estimate, with kernel covariance equal to empirical covariance (with divisor n-1) / n.

The latter two are described in Hesterberg, Tim C. (2004), Unbiasing the Bootstrap-Bootknife Sampling vs. Smoothing, Proceedings of the Section on Statistics and the Environment, American Statistical Association, 2924-2930. http://home.comcast.net/~timhesterberg/articles/JSM04-bootknife.pdf

All three are undefined for samples of size 1. You need to go to some other bootstrap, e.g. a parametric bootstrap with variability estimated from other data.

And the 'survey' package supplies the first option. (It also supplies a bootstrap sample of size n that allows finite population corrections, designed for situations with a large n and a high sampling fraction, such as some business surveys.)

With a sample size of 1 per stratum there are no design-unbiased estimators of the standard error, so as others have said you need external data.

-thomas

--
Thomas Lumley
Professor of Biostatistics
University of Auckland
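The (n-1)/n factor for the mean can be checked without any resampling, because the variance of a bootstrap mean is the variance of the empirical distribution (divisor n) divided by the resample size. The snippet below is a small arithmetic sketch with a made-up sample x, not code from the thread.

```r
# Hypothetical sample; any numeric vector with n >= 2 would do.
x <- c(2.1, 3.4, 1.9, 5.0, 4.2)
n <- length(x)

# Usual unbiased estimate of Var(mean): s^2 / n, with divisor n - 1 in s^2.
var_usual <- var(x) / n

# Empirical-distribution variance (divisor n), which is what resampling sees.
sigma2_hat <- mean((x - mean(x))^2)

# Ordinary bootstrap, resamples of size n: too small by the factor (n - 1) / n.
var_boot <- sigma2_hat / n

# First remedy, resamples of size n - 1: matches the usual estimate.
var_remedy <- sigma2_hat / (n - 1)

var_boot / var_usual   # (n - 1) / n, here 0.8
```

With n = 2 the ratio is 1/2, which is the "factor of 2" mentioned for two samples per stratum.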
[R] glmnet_1.5 uploaded to CRAN
This is a new version of glmnet that incorporates some bug fixes and speedups.

* a new convergence criterion which offers 10x or more speedups for saturated fits (mainly affects logistic, Poisson and Cox)
* one can now predict directly from a cv.object - see the help files for cv.glmnet and predict.cv.glmnet
* other new methods are deviance() for glmnet and coef() for cv.glmnet

Here is the description of the package.

glmnet is a package that fits the regularization path for linear, two- and multi-class logistic regression models, Poisson regression and the Cox model, with elastic net regularization (tunable mixture of L1 and L2 penalties). glmnet uses pathwise coordinate descent, and is very fast.

Some of the features of glmnet:

* by default it computes the path at 100 uniformly spaced (on the log scale) values of the regularization parameter
* glmnet is very fast, even for large data sets.
* recognizes and exploits sparse input matrices (ala Matrix package). Coefficient matrices are output in sparse matrix representation.
* penalty is (1-a)/2*||beta||_2^2 + a*||beta||_1 where a is between 0 and 1; a=1 is the Lasso penalty, a=0 is the ridge penalty. For many correlated predictors, a=.95 or thereabouts improves the performance of the lasso.
* convenient predict, plot, print, and coef methods
* variable-wise penalty modulation allows each variable to be penalized by a scalable amount; if zero that variable always enters
* glmnet uses a symmetric parametrization for multinomial, with constraints enforced by the penalization.
* a comprehensive set of cross-validation routines are provided for all models and several error measures
* offsets and weights can be provided for all models

Examples of glmnet speed trials:

Newsgroup data: N=11,000, p=0.75 Million, two class logistic, 100 values along lasso path. Time = 2 mins
14 Class cancer data: N=144, p=16K, 14 class multinomial, 100 values along lasso path. Time = 30 secs

Authors: Jerome Friedman, Trevor Hastie, Rob Tibshirani. See our paper http://www-stat.stanford.edu/~hastie/Papers/glmnet.pdf for implementation details, and comparisons with other related software.

---
Trevor Hastie has...@stanford.edu
Professor, Department of Statistics, Stanford University
Phone: (650) 725-2231 (Statistics) Fax: (650) 725-8977
(650) 498-5233 (Biostatistics) Fax: (650) 725-6951
URL: http://www-stat.stanford.edu/~hastie
address: room 104, Department of Statistics, Sequoia Hall
390 Serra Mall, Stanford University, CA 94305-4065
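To make the role of the mixing weight concrete, here is a minimal sketch of the elastic-net penalty in plain R, following the (1 - a)/2 convention used in the glmnet documentation. The helper enet_penalty and the coefficient vector are illustrative only; they are not part of the glmnet package.

```r
# Elastic-net penalty for a coefficient vector `beta` and mixing weight `a`
# (illustrative helper, not a glmnet function).
enet_penalty <- function(beta, a) {
  stopifnot(a >= 0, a <= 1)
  (1 - a) / 2 * sum(beta^2) + a * sum(abs(beta))
}

beta <- c(0.5, -2, 0, 1)        # hypothetical coefficients
enet_penalty(beta, a = 1)       # pure L1 (lasso) term: sum(abs(beta))
enet_penalty(beta, a = 0)       # pure L2 (ridge) term: sum(beta^2) / 2
enet_penalty(beta, a = 0.95)    # mostly lasso, with a little ridge mixed in
```

In glmnet itself the weight is passed as the alpha argument, and the whole penalty is additionally scaled by the regularization parameter lambda along the path.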
Re: [R] ANOVA table and lmer
Hi Jim,

"The decomposition of the sum of squares should be the same regardless of whether block is treated as random or fixed."

Should it? By whose reckoning? The models you are comparing are different. Simple consideration of the terms listed in the (standard) ANOVA output shows that this is so, so how could the sums of squares be the same?

"I noticed that other people have asked similar questions in the past, but I haven't seen a satisfactory explanation."

Maybe, but it has been answered (by me, and surely by others). However, the canonical reference would be Venables and Ripley's MASS (pp. 283--286).

The models you need to compare are the following:

##
library(MASS)   # oats data
library(nlme)   # lme()
library(lme4)   # lmer()

Aov.mod <- aov(Y ~ V * N + Error(B/V/N), data = oats)
Lme.mod <- lme(Y ~ V * N, random = ~1 | B/V/N, data = oats)
Lmer.mod <- lmer(Y ~ V * N + (1|B) + (1|B:V) + (1|B:N), data = oats)

summary(Aov.mod)
anova(Lme.mod)
anova(Lmer.mod)

HTH,
Mark Difford.
[R] avoid a loop
Let's suppose I have userids and associated attributes... columns a and b

a <- c(1,1,1,2,2,3,3,3,3)
b <- c("a","b","c","a","d","a","b","e","f")

so a unique list of a would be

id <- unique(a)

I want a matrix like this...

     [,1] [,2] [,3]
[1,]    3    1    2
[2,]    1    2    1
[3,]    2    1    4

where element i,j is the number of items in b that id[i] and id[j] share. So for example, in element [1,3] of the result matrix, I want to see 2. That is, ids 1 and 3 share two common elements in b, namely a and b.

This is hard to articulate, so sorry for the terrible description here. The way I have solved it is to do a double loop, looping over every member of the id column and comparing it to every other member of id to see how many elements of b they share. This takes forever.

Thanks
cn
Re: [R] avoid a loop
Here's one possibility:

library(ecodist)
a <- c(1,1,1,2,2,3,3,3,3)
b <- c("a","b","c","a","d","a","b","e","f")
x <- crosstab(a, b, rep(1, length(a)))

x
  a b c d e f
1 1 1 1 0 0 0
2 1 0 0 1 0 0
3 1 1 0 0 1 1

x %*% t(x)
  1 2 3
1 3 1 2
2 1 2 1
3 2 1 4

Sarah

On Thu, Nov 4, 2010 at 3:42 PM, cory n corynis...@gmail.com wrote:
[snip]

--
Sarah Goslee
http://www.functionaldiversity.org
Re: [R] avoid a loop
On Nov 4, 2010, at 4:24 PM, Sarah Goslee wrote:

Here's one possibility: [snip]

Another way:

sapply(1:3, function(y) {
    sapply(1:3, function(x) {
        length(intersect(b[a == y], b[a == x]))
    })
})

     [,1] [,2] [,3]
[1,]    3    1    2
[2,]    1    2    1
[3,]    2    1    4

David Winsemius, MD
West Hartford, CT
Re: [R] Loop
Hi again,

Still don't quite get it... Here's what I did:

mat <- read.csv("litologija.csv", dec = ".", sep = ";")
apply(mat, 2, function(x) head(sort(table(x), decreasing = TRUE), 10))

With that I get a table (list/matrix...) which gives the highest counts of occurrences of each value in a table (at least I think so). But the problem is that it does not tell which value occurs the most (has the highest count).

If written like this:

apply(mat, 2, function(x) sort(table(x), decreasing = TRUE))

I get decreasingly sorted counts of occurrences of a specific field and the value of that field for each column (labels first, counts on the following line):

$W2
x
PEŠČEN GRADUIRAN IN PROD DO GLINAST PROD, PREPEREL MELJAST GRUŠČ GLINA Z MALO GRANULIRANA
1872 1542 552 519 458 214 175 174 132 114 62 53 47 45
ZELO PEŠČENA ZAGLINJEN KARBONATNI SKRILAVCA, S SKRILAVCA GRANULIRAN PEČŠEN VEZAN ZAOBLJEN GR. DROBEN SLABO
40 34 31 26 26 25 25 24 17 17 17 15 12 12
GRUŠČ, MELJASTO PEŠEEN DOBRO GRAN. PEŠČENJAKA HUDOURNIŠKI MELJNA PEŠČN GIRADUIRAN GLINAST, GOST GRADUTRAN GRANUL.
11 11 11 10 10 9 8 8 8 6 6 6 6 6
PESEK ZAMELJEN GRADUIPAN PREPEPEL PŠČEN GPADUIRAN GRADUIRAN, GRADURAN POTOČNI PREPERL SAVSKI CONA GLINASTEGA GRADUIRN
6 6 5 5 5 4 4 4 4 4 4 3 3 3
MELJAST, PEČEN PEŠČEN, PLASTI DELNO GLINA, GLINASTO GRADUIAN GRADULRAN GRDUIRAN GRUŠČ. KARB. KONGLOMERAT
3 3 3 3 2 2 2 2 2 2 2 2 2 2
KONGLOMERAT, MELJ NEKOLIKO OKER PESEK, PEŠČCEN PEŠČEN. PLASTEH POD PPEPEREL RPOD UMAZAN ZAOBLJEN, -
2 2 2 2 2 2 2 2 2 2 2 2 2 1
(GRUŠČ) (KARBONATNI) APNENCA DROBNOZRNAT, ENAKOMEREN GBADUIRAN GLIANAST GLINASTA GPADUIRALN GPUŠČ GRADAUIRAN GRADUIRA GRADUIRANPEŠČEN GRADUIRAU

But the first code somehow loses the actual value of the field and just gives the count:

apply(mat, 2, function(x) head(sort(table(x), decreasing = TRUE), 10))

      VrtinaID ZapStev GlobinaOd GlobinaDo USCS Opis   W1   W2   W3   W4   W5   W6   W7  W8  W9  W10  W11  W12  W13  W14  W15
 [1,]       15    1248       282       290 2131   15 1820 1872 1677 1479 1441 1465 1261 769 848 1088 1490 1968 2459 2943 3408
 [2,]       11    1119       198       235 1305   13 1791 1542 1495 1334 1317 1247  829 652 783  660  606  603  381  381  301
 [3,]       11    1078       174       210  784   11  532  552  566  529  532  716  511 575 576  416  464  384  368  282  279
 [4,]       11     835       147       173  691   11  471  519  390  351  358  571  364 521 556  381  398  352  287  282  259
 [5,]       10     584       133       172  646   11  376  458  296  311  323  195  252 329 429  343  397  336  244  242  224
 [6,]       10     389       123       142  386   10  253  214  237  268  310  130  233 265 376  263  378  258  228  210  205
 [7,]       10     257       114       130  183   10  247  175  201  242  157  130  179 258 267  219  230  239  197  185  155
 [8,]        9     198       105       126  148    9  135  174  157  170  146  102  163 213 266  215  221  188  197  179  155
 [9,]        9     171       101        95   71    9  102  132  139  161  141   89  145 199 140  192  205  168  191  160  122
[10,]
Re: [R] avoid a loop
Hi:

To mimic Sarah Goslee's reply within base R, either of these works:

crossprod(t(as.matrix(xtabs( ~ a + b))))
crossprod(t(as.matrix(table(a, b))))

HTH,
Dennis

On Thu, Nov 4, 2010 at 12:42 PM, cory n corynis...@gmail.com wrote:
[snip]
Re: [R] avoid a loop
And to wrap it up and help you choose, here are four functions based on these emails (the first one is my own slight variant):

library(ecodist)
a <- sample(1:1000, 10^4, replace = TRUE)
b <- sample(letters[1:6], 10^4, replace = TRUE)

foo1 <- function() {
    x <- table(a, b)
    return(x %*% t(x))
}
foo2 <- function() {
    x <- crosstab(a, b, rep(1, length(a)))
    return(x %*% t(x))
}
foo3 <- function() {
    sapply(1:1000, function(y) {
        sapply(1:1000, function(x) {
            length(intersect(b[a == y], b[a == x]))
        })
    })
}
foo4 <- function() {crossprod(t(as.matrix(table(a, b))))}

system.time(x1 <- foo1())
   user  system elapsed
  0.028   0.008   0.038
system.time(x2 <- foo2())
   user  system elapsed
  0.076   0.008   0.087
## I got tired of waiting
system.time(x3 <- foo3())
menu-bar signals break
Timing stopped at: 104.951 1.336 110.909
system.time(x4 <- foo4())
   user  system elapsed
  0.024   0.020   0.043

all.equal(x1, x2, check.attributes = FALSE)
[1] TRUE
all.equal(x1, x4, check.attributes = FALSE)
[1] TRUE

This suggests the order, from fastest to slowest, is: foo1, foo4, foo2, foo3.

Cheers,
Josh

On Thu, Nov 4, 2010 at 12:42 PM, cory n corynis...@gmail.com wrote:
[snip]

--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/
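One caveat worth sketching: the cross-product of a raw count table matches the intersect()-based answer only when no (id, attribute) pair is repeated, which holds for the original nine-row example but not necessarily for randomly sampled data. Assuming the goal is the number of distinct shared attributes, the counts can first be reduced to a 0/1 incidence matrix. The names inc and shared are made up for this illustration.

```r
a <- c(1, 1, 1, 2, 2, 3, 3, 3, 3)
b <- c("a", "b", "c", "a", "d", "a", "b", "e", "f")

# Incidence matrix: rows are ids, columns are attributes, 1 = id has attribute.
# Thresholding at 0 guards against repeated (id, attribute) pairs.
inc <- (unclass(table(a, b)) > 0) * 1

# Entry [i, j] counts the distinct attributes shared by id i and id j.
shared <- inc %*% t(inc)
shared
```

On the example data this reproduces the requested 3 x 3 matrix; with duplicated pairs it keeps agreeing with the intersect() version, whereas the raw-count product would not.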
Re: [R] Loop
Is this closer to what you want, assuming that it is the value of the most frequently occurring:

apply(mat, 2, function(x) head(names(sort(table(x), decreasing = TRUE)), 5))

     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] "1"  "14" "5"  "1"  "4"  "14" "6"  "18" "11" "19"
[2,] "3"  "3"  "13" "12" "3"  "11" "14" "9"  "18" "12"
[3,] "2"  "18" "20" "8"  "11" "12" "17" "14" "14" "7"
[4,] "5"  "11" "8"  "19" "5"  "18" "18" "15" "16" "10"
[5,] "18" "13" "11" "11" "17" "3"  "4"  "16" "8"  "16"

2010/11/4 Matevž Pavlič matevz.pav...@gi-zrmk.si:
[snip]
[R] creating vectors with three variables out of three datasets
Hi there,

I've got a problem with how to create a vector with three variables out of three separate ASCII files. These three ASCII files contain pixel information of the same image but different bands, and I need a matrix of vectors, with each vector containing the corresponding pixel values for each band. Up to now I've read the ASCII files separately into three matrices, but I don't know how to put the corresponding pixel values together.

Looking forward to any help.

Thank you
Dominik
Re: [R] creating vectors with three variables out of three datasets
DomDom wrote:

Hi there, I've got a problem with how to create a vector with three variables out of three separate ASCII files. These three ASCII files contain pixel information of the same image but different bands, and I need a matrix of vectors, with each vector containing the corresponding pixel values for each band. Up to now I've read the ASCII files separately into three matrices, but I don't know how to put the corresponding pixel values together.

Perhaps rbind or cbind, see ?rbind.

It would be useful if you gave us a small, reproducible example of the type of data you have and what you want to do with it; please see the Posting Guide.
Re: [R] creating vectors with three variables out of three datasets
Okay, sorry. I've got three ASCII files with pixel values, without any header information. So if the first lines of the three ASCII files are:

ascii1: 11 12 13
ascii2: 14 15 16
ascii3: 17 18 19

I would like a new matrix with:

11,14,17; 12,15,18; 13,16,19;

Thx
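A minimal sketch of the cbind() suggestion applied to this example, assuming each file has already been read into a numeric matrix. The objects band1 to band3 and the file name in the comment are hypothetical.

```r
# Hypothetical band matrices standing in for the three ASCII files;
# in practice each might come from as.matrix(read.table("band1.txt")).
band1 <- matrix(c(11, 12, 13), nrow = 1)
band2 <- matrix(c(14, 15, 16), nrow = 1)
band3 <- matrix(c(17, 18, 19), nrow = 1)

# Flatten each band row by row (t() because R stores matrices column-wise),
# then bind the bands as columns: one row per pixel, one column per band.
pixels <- cbind(band1 = as.vector(t(band1)),
                band2 = as.vector(t(band2)),
                band3 = as.vector(t(band3)))
pixels
```

Each row of pixels is then the three-band vector for one pixel, so pixels[1, ] holds 11, 14, 17 in this example.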
[R] mgui
Hello,

I am using the mgui function in the following way:

mgui(graf_cuenta_margen_interfaz,
     title = c("Gráficas", "Histogramas valoración (No lineal) Cuenta de Margen"),
     exec = "Graficar",
     argText = list(fecha_adelante = "Fecha adelante"),
     closeOnExec = TRUE, output = NULL,
     helps = list(fecha_adelante = paste("La valoración de cuantos días adelante se desea graficar. Las opciones son los días que se hayan escogido en las simulacion:", guiGetSafe("horizontes_text"))))

As you can see, for the help text I am building a string which uses a variable that I can modify:

helps = list(fecha_adelante = paste("La valoración de cuantos días adelante se desea graficar. Las opciones son los días que se hayan escogido en las simulacion:", guiGetSafe("horizontes_text")))

The problem arises when I modify this variable after I have already used this option in my program: when I use the option again, the variable does not seem to be updated, even though calling guiGetSafe("horizontes_text") on the R console shows the change. If, on the other hand, I change the variable before using this option for the first time, the changes DO appear. So to make a change in my variable appear after the option has been used, I have to close the program and open it again.

Does anybody know how I can have changes to my variable picked up by the program even though I have used the option before, without closing and reopening it?

Thank you
Felipe Parra
[R] NA values
Hi,

I tried to fit an exponential family state-space model with the KFAS package. The problem is that my data set includes some NA observations and it does not seem to work. Any suggestions?

Thanks in advance,
Federico
[R] count occurrence and distance of characters in string
Hello all,

I want to know how often one character occurs in a given string, and the distance between every two consecutive occurrences (distance = number of other characters between them).

Thanks
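One way this could be approached in base R, shown as a small sketch: split the string into single characters, locate the character of interest, and take differences of the positions. The string "banana" and the variable names are made up for the example.

```r
s  <- "banana"   # hypothetical input string
ch <- "a"        # character of interest

chars <- strsplit(s, "")[[1]]   # split into single characters
pos   <- which(chars == ch)     # positions of each occurrence: 2, 4, 6
n_occ <- length(pos)            # how often ch occurs: 3
gaps  <- diff(pos) - 1          # characters between neighbours: 1, 1
```

diff() gives the step from one occurrence to the next, so subtracting 1 leaves the number of other characters in between; adjacent occurrences give a gap of 0.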
Re: [R] Loop
Hi Jim,

Actually, this is better, but both values are what I am looking for: the count and the value that was counted. Is there a way to just paste those two together?

Thanks,
m

-----Original Message-----
From: jim holtman [mailto:jholt...@gmail.com]
Sent: Thursday, November 04, 2010 9:59 PM
To: Matevž Pavlič
Cc: Petr PIKAL; r-help@r-project.org
Subject: Re: [R] Loop

[snip]
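A possible way to get label and count side by side, sketched on a small made-up data frame: keep the sorted table, then paste its names and counts together. The helper top_counts and the toy mat are hypothetical and only illustrate the idea.

```r
# Hypothetical stand-in for `mat`: two columns of repeated labels.
mat <- data.frame(
  W1 = c("a", "b", "a", "c", "a", "b"),
  W2 = c("x", "x", "y", "x", "z", "y"),
  stringsAsFactors = FALSE
)

# For one column: sort the counts, keep the top k, and glue label and count.
top_counts <- function(x, k = 5) {
  tab <- sort(table(x), decreasing = TRUE)
  tab <- tab[seq_len(min(k, length(tab)))]
  paste0(names(tab), " (", as.vector(tab), ")")
}

# One column of "label (count)" strings per column of mat.
sapply(mat, top_counts, k = 2)
```

If the columns have fewer than k distinct values the results differ in length, in which case sapply() returns a list rather than a matrix.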