Re: [R] Problems with reading data by readWorksheetFromFile of XLConnect Package
Hi Anthony,

Thank you very much. It works very well. However, after this line

    temp <- sapply( temp , as.numeric )

the data becomes a series of numbers instead of a matrix. Is there any way to keep it a matrix?

Thanks,
Miao

    temp <- readWorksheetFromFile("130502temp.xlsx", sheet=1, header=FALSE, startRow=2, endRow=11, startCol=2, endCol=5)
    temp <- sapply( temp , function( x ) gsub( ',' , '' , x ) )
    temp
          Col1     Col2   Col3    Col4
     [1,] "647853" "1413" "57662" "27897"
     [2,] "491400" "1365" "40919" "20411"
     [3,] "38604"  "-"    "5505"  "985"
     [4,] "576"    "-"    "20"    "54"
     [5,] "80845"  "21"   "10211" "4494"
     [6,] "36428"  "27"   "1007"  "1953"
     [7,] "269915" "587"  "32988" "12779"
     [8,] "224494" "-"    "30554" "9184"
     [9,] "11858"  "587"  "-"     "686"
    [10,] "3742"   "-"    "81"    "415"

    temp <- sapply( temp , as.numeric )
    Warning messages:
    1: In lapply(X = X, FUN = FUN, ...) : NAs introduced by coercion
    2: In lapply(X = X, FUN = FUN, ...) : NAs introduced by coercion
    3: In lapply(X = X, FUN = FUN, ...) : NAs introduced by coercion
    4: In lapply(X = X, FUN = FUN, ...) : NAs introduced by coercion
    5: In lapply(X = X, FUN = FUN, ...) : NAs introduced by coercion

    temp
    647853 491400  38604    576  80845  36428 269915
    647853 491400  38604    576  80845  36428 269915
    224494  11858   3742   1413   1365      -      -
    224494  11858   3742   1413   1365     NA     NA
        21     27    587      -    587      -  57662
        21     27    587     NA    587     NA  57662
     40919   5505     20  10211   1007  32988  30554
     40919   5505     20  10211   1007  32988  30554
         -     81  27897  20411    985     54   4494
        NA     81  27897  20411    985     54   4494
      1953  12779   9184    686    415
      1953  12779   9184    686    415

    temp[ is.na( temp ) ] <- 0
    temp
    647853 491400  38604    576  80845  36428 269915
    647853 491400  38604    576  80845  36428 269915
    224494  11858   3742   1413   1365      -      -
    224494  11858   3742   1413   1365      0      0
        21     27    587      -    587      -  57662
        21     27    587      0    587      0  57662
     40919   5505     20  10211   1007  32988  30554
     40919   5505     20  10211   1007  32988  30554
         -     81  27897  20411    985     54   4494
         0     81  27897  20411    985     54   4494
      1953  12779   9184    686    415
      1953  12779   9184    686    415

2013/5/2 Anthony Damico ajdam...@gmail.com

> try adding colTypes = 'numeric' to your readWorksheetFromFile() call
>
> if that doesn't work, try a few other steps
>
>     # view what data types your file is being read in as
>     sapply( temp , class )
>
>     # convert all fields to character if they're factor variables..
>     # but i don't think you need this, readWorksheet defaults to `character`
>     temp <- sapply( temp , as.character )
>
>     # you can also convert a subset like this
>     temp[ , c( 1 , 3:4 ) ] <- sapply( temp[ , c( 1 , 3:4 ) ] , as.character )
>
>     # remove commas from character strings
>     temp <- sapply( temp , function( x ) gsub( ',' , '' , x ) )
>
>     # convert all fields to numeric
>     temp <- sapply( temp , as.numeric )
>
>     # convert all NA fields to zeroes if you prefer
>     temp[ is.na( temp ) ] <- 0
>
> On Wed, May 1, 2013 at 11:55 PM, jpm miao miao...@gmail.com wrote:
>
>> Hi,
>>
>> Attached are two datasheets to be read. My raw data 130502temp.xlsx contains numbers with ' symbols, and they can't be read as numbers. Even if I copy and paste as numbers to form a new file 130502temp_number1.xlsx, they could not be read smoothly.
>>
>> 1. How can I read the datasheet as numbers?
>> 2. How can I treat the notation "-" as (1) NA or (2) zero?
>>
>> Thanks,
>> Miao
>>
>>     temp <- readWorksheetFromFile("130502temp.xlsx", sheet=1, header=FALSE, startRow=2, endRow=11, startCol=2, endCol=5)
>>     temp
>>           Col1  Col2   Col3   Col4
>>     1  647,853 1,413 57,662 27,897
>>     2  491,400 1,365 40,919 20,411
>>     3   38,604     -  5,505    985
>>     4      576     -     20     54
>>     5   80,845    21 10,211  4,494
>>     6   36,428    27  1,007  1,953
>>     7  269,915   587 32,988 12,779
>>     8  224,494     - 30,554  9,184
>>     9   11,858   587      -    686
>>     10   3,742     -     81    415
>>     temp[2,2]
>>     [1] "1,365"
>>     temp[2,2]+3
>>     Error in temp[2, 2] + 3 : non-numeric argument to binary operator
>>     temp_num <- readWorksheetFromFile("130502temp_number1.xlsx", sheet=1, header=FALSE, startRow=2, endRow=11, startCol=2, endCol=5)
>>     temp_num[2,2]
>>     [1] "1,365"
>>     temp_num[2,2]+3
>>     Error in temp_num[2, 2] + 3 : non-numeric argument to binary operator
>>     as.numeric(temp_num[2,2])+3
>>     [1] NA
>>     Warning message:
>>     NAs introduced by coercion

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
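On the question of keeping the matrix shape: after the gsub() step temp is already a character matrix, and sapply() over a matrix visits every cell, so the result is a flat (named) vector. A minimal sketch of one possible alternative, assuming a small made-up matrix `m` as a stand-in for temp (the values are copied from the post, the object name is hypothetical), is to convert column by column with apply():

```r
# Hypothetical stand-in for the character matrix produced by the gsub() step
m <- matrix(c("647853", "491400", "38604",
              "1413",   "1365",   "-"),
            nrow = 3,
            dimnames = list(NULL, c("Col1", "Col2")))

# sapply() over a matrix visits every cell, so the dimensions are lost
flat <- suppressWarnings(sapply(m, as.numeric))

# apply() over columns (MARGIN = 2) converts column by column and keeps the shape
num <- suppressWarnings(apply(m, 2, as.numeric))

# "-" could not be parsed and became NA; replace with zero if preferred
num[is.na(num)] <- 0
print(num)
```

If temp is still a data frame at that point, `temp[] <- lapply(temp, as.numeric)` is another common idiom that keeps the rectangular structure.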
Re: [R] Create and read symbolic links in Windows
Thanks for the suggestions. On Windows (Windows 7, 64-bit), I couldn't get file.symlink to work. file.link did return TRUE, but I did not see any link at the target location. I am not sure whether I am missing anything. I hope it has nothing to do with administrator accounts and administrator rights. Is it something I should check with my system administrator?

Thanks,
Santosh

On Thu, May 2, 2013 at 12:22 PM, Prof Brian Ripley rip...@stats.ox.ac.uk wrote:

> On 02/05/2013 19:50, Santosh wrote:
>> Dear Rxperts..
>> Got a couple of quick q's..
>> I am using R in windows environment (both 32-bit and 64-bit)
>> a) Is there a way to create symbolic links to some data files?
>
> See ?file.symlink. ??'symbolic link' should have got you there.
> Note that this is not very useful for files, but that is a Windows and not an R restriction.
>
>> b) How do I read data from symbolic links?
>
> The same ways you read data from files.
>
>> Thanks so much..
>> Santosh
>
> --
> Brian D. Ripley, rip...@stats.ox.ac.uk
> Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
> University of Oxford, Tel: +44 1865 272861 (self)
> 1 South Parks Road, +44 1865 272866 (PA)
> Oxford OX1 3TG, UK Fax: +44 1865 272595
Re: [R] Create and read symbolic links in Windows
On 03/05/2013 07:33, Santosh wrote:
> Thanks for the suggestions. In windows (Windows 7, 64-bit), I couldn't get file.symlink to work, but file.link did return the result to be TRUE but at the target location, I did not see any link. Not sure I am missing anything more.. Hope it's nothing to do with administrator accounts and administrator rights... Is it something I should check with my system administrator?

You may need to update your R, although the posting guide asked you to do that before posting. There was a relevant bug fix in 2.15.3.

> Thanks,
> Santosh

--
Brian D. Ripley, rip...@stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
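For readers following this thread, here is a minimal sketch of how file.link() and file.symlink() are called and how the result can be checked on disk. The file names are hypothetical and live in a temporary directory. In both functions the first argument is the existing file and the second is the link to be created, so a link "missing at the target location" may simply be sitting at the path given as the second argument. Whether symbolic links can be created on Windows depends on the account's privileges and the R version, so the sketch checks the outcome rather than assuming success.

```r
# Hypothetical file names inside a temporary directory
src  <- file.path(tempdir(), "data_original.csv")
hard <- file.path(tempdir(), "data_hardlink.csv")
soft <- file.path(tempdir(), "data_symlink.csv")
unlink(c(hard, soft))  # start clean if the sketch is re-run

write.csv(data.frame(x = 1:3), src, row.names = FALSE)

# Argument order: existing file first, link to be created second.
# Both functions return TRUE/FALSE per link rather than stopping with an error.
ok_hard <- file.link(src, hard)
ok_soft <- suppressWarnings(file.symlink(src, soft))

# Check the result on disk instead of trusting the return value alone
if (ok_hard && file.exists(hard)) {
  d_hard <- read.csv(hard)        # read through the link like any other file
}
if (ok_soft && file.exists(soft)) {
  d_soft <- read.csv(soft)
  target <- Sys.readlink(soft)    # path the symbolic link points to
}
```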
[R] Likelihood
Hi all,

I have run a regression and want to calculate the likelihood of obtaining the sample. Is there a way in which I can use R to get this likelihood value? Appreciate your help on this.

The following are the details:

    raw_ols1=lm(data$LOSS~data$GDP+data$HPI+data$UE)
    summary(raw_ols1)

    Call:
    lm(formula = data$LOSS ~ data$GDP + data$HPI + data$UE)

    Residuals:
           Min         1Q     Median         3Q        Max
    -0.0023859 -0.0006236  0.0002444  0.0006739  0.0017713

    Coefficients:
                  Estimate Std. Error t value Pr(>|t|)
    (Intercept) -3.940e-02  6.199e-03  -6.356 9.54e-06 ***
    data$GDP     3.467e-09  7.652e-09   0.453 0.656580
    data$HPI     7.935e-05  1.875e-05   4.232 0.000635 ***
    data$UE      6.858e-04  2.800e-04   2.449 0.026227 *
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 0.001198 on 16 degrees of freedom
    Multiple R-squared: 0.9528, Adjusted R-squared: 0.944
    F-statistic: 107.8 on 3 and 16 DF, p-value: 7.989e-11

Thanks and regards,
Preetam

--
Preetam Pal
(+91)-9432212774
M-Stat 2nd Year, Room No. N-114
Statistics Division, C.V.Raman Hall
Indian Statistical Institute, B.H.O.S.
Kolkata.
Re: [R] Factors and Multinomial Logistic Regression
On Thu, 02 May 2013 22:04:26 +0200, peter dalgaard pda...@gmail.com wrote:

> On May 2, 2013, at 20:33 , Lorenzo Isella wrote:
>
>> On Wed, 01 May 2013 23:49:07 +0200, peter dalgaard pda...@gmail.com wrote:
>>
>>> It still doesn't work!
>>
>> Apologies; since I had already imported nnet in my workspace, the script worked on my machine even without importing it explicitly (see the script at the end of the email). Sorry for the confusion.
>
> You still owe us an answer why you thought that this:
>
>     Coefficients:
>          (Intercept)     science       socst femalefemale
>     low     1.912288 -0.02356494 -0.03892428   0.81659717
>     high   -4.057284  0.02292179  0.04300323  -0.03287211
>
>     Std. Errors:
>          (Intercept)    science      socst femalefemale
>     low     1.127255 0.02097468 0.01951649    0.3909804
>     high    1.222937 0.02087182 0.01988933    0.3500151
>
>     Residual Deviance: 388.0697
>
> is at all different from the Stata output. As far as I can tell it is EXACTLY the same!
>
> Apologies for being insistent, but this will come up in Internet searches as "I couldn't make R do what Stata does".

You are right. I must have messed up my workspace...
In any case, the idea that R is somehow inferior to Stata never crossed my mind. Rather, I was puzzled because I (not R) could not reproduce an allegedly almost textbook-like example I found on the web.

Many thanks for your help.

Lorenzo
[R] untar() error
Dear List,

I have a list of 600+ *.gz files that I would like to extract and read the geotiffs contained within them. I tried using the untar() function to simplify this task but I am stumped by an error. I've combed the Internet for a solution without luck. The details are below, and any help in solving this matter is appreciated.

    files = list.files(path = "J:/GIMMS/NDVI", pattern = "data.tif.gz", all.files = TRUE, full.names = TRUE, recursive = TRUE, ignore.case = TRUE, include.dirs = TRUE)

    lapply(files, untar)
    Error in rawToChar(block[seq_len(ns)]) :
      embedded nul in string: 'II*\0à \001´ \0\0`G\0\0\fn\0\0¸â\0\0d»\0\0\020â\0\0¼\b\001\0h/\001\0\024V\001\0Ã|\001\0l£\001\0\030Ã\001\0Ãð\001\0p\027\002\0\034\002\0Ãd\002\0tâ¹\002\0 ²\002\0ÃÃ\002\0xÿ\002\0$\003\0ÃL\003\0|s\003'

    untar(files[1])
    Error in rawToChar(block[seq_len(ns)]) :
      embedded nul in string: 'II*\0à \001´ \0\0`G\0\0\fn\0\0¸â\0\0d»\0\0\020â\0\0¼\b\001\0h/\001\0\024V\001\0Ã|\001\0l£\001\0\030Ã\001\0Ãð\001\0p\027\002\0\034\002\0Ãd\002\0tâ¹\002\0 ²\002\0ÃÃ\002\0xÿ\002\0$\003\0ÃL\003\0|s\003'

    untar("J:/GIMMS/NDVI/1981/81aug15a.n07-VIg/81aug15a.n07-VIg_data.tif.gz")
    Error in rawToChar(block[seq_len(ns)]) :
      embedded nul in string: 'II*\0à \001´ \0\0`G\0\0\fn\0\0¸â\0\0d»\0\0\020â\0\0¼\b\001\0h/\001\0\024V\001\0Ã|\001\0l£\001\0\030Ã\001\0Ãð\001\0p\027\002\0\034\002\0Ãd\002\0tâ¹\002\0 ²\002\0ÃÃ\002\0xÿ\002\0$\003\0ÃL\003\0|s\003'

    traceback()
    3: rawToChar(block[seq_len(ns)])
    2: untar2(tarfile, files, list, exdir)
    1: untar(files[1])

    sessionInfo()
    R version 2.15.2 (2012-10-26)
    Platform: x86_64-w64-mingw32/x64 (64-bit)

    locale:
    [1] LC_COLLATE=English_United States.1252
    [2] LC_CTYPE=English_United States.1252
    [3] LC_MONETARY=English_United States.1252
    [4] LC_NUMERIC=C
    [5] LC_TIME=English_United States.1252

    attached base packages:
    [1] stats graphics grDevices utils datasets methods base

___
Hakim Abdi
Doctoral Student
Physical Geography and Ecosystem Science
Lund University
Sölvegatan 12, 223 62 Lund, Sweden
Office: +46 (0) 46 2223132
Mobile: +46 (0) 73 9300116
Email: hakim.a...@nateko.lu.se
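A note on the error above: the bytes shown at the start of the message, 'II*\0', are the signature of a little-endian TIFF file. That suggests each .gz file holds a single gzip-compressed TIFF rather than a tar archive, so untar() decompresses it and then fails while trying to parse a tar header. The following is a minimal sketch, not a tested fix for these particular files, of decompressing plain gzip files with base R connections. The function name `gunzip_base` and the demonstration file are hypothetical.

```r
# Decompress a single gzip file (not a tar archive) using base R connections.
gunzip_base <- function(src, dest = sub("\\.gz$", "", src), chunk = 1e6) {
  con_in <- gzfile(src, open = "rb")
  on.exit(close(con_in), add = TRUE)
  con_out <- file(dest, open = "wb")
  on.exit(close(con_out), add = TRUE)
  repeat {
    buf <- readBin(con_in, what = "raw", n = chunk)  # read up to 'chunk' bytes
    if (length(buf) == 0L) break                     # end of compressed stream
    writeBin(buf, con_out)
  }
  invisible(dest)
}

# Self-contained demonstration on a made-up file; the first bytes mimic a TIFF header
payload <- as.raw(c(0x49, 0x49, 0x2a, 0x00, 1:20))
gz_path <- file.path(tempdir(), "demo_data.tif.gz")
gz_con <- gzfile(gz_path, open = "wb")
writeBin(payload, gz_con)
close(gz_con)

out_path <- gunzip_base(gz_path)
restored <- readBin(out_path, what = "raw", n = 100)
```

Applied to the list in the post, this would be something like `lapply(files, gunzip_base)`, after which the resulting .tif files can be read with whichever raster package is in use. The gunzip() function in the R.utils package offers similar functionality.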
[R] Edmonton course: Regression, GLM & GAM with R intro
We would like to announce the following statistics course:

Data exploration, regression, GLM & GAM. With introduction to R

When: 26 - 30 August 2013
Where: Edmonton, Canada

For details, see: http://www.highstat.com/statscourse.htm
Course flyer: http://www.highstat.com/Courses/Flyer2013_09Canada.pdf

Kind regards,
Alain Zuur
[R] significant test of two quadratic regression models (lm)
Hello,

I am working with two quadratic regression models of the form y = ax^2 + bx + c, fitted with lm():

y1 = observed migration distance of butterflies (y1 = a1x^2 + b1x + c1)
y2 = predicted migration distance of butterflies, based on body mass (y2 = a2x^2 + b2x + c2)
x = body mass of butterflies

I would like to check whether the two regression models differ by testing whether the coefficients (a, b, c) of the y1 and y2 models differ (null hypothesis: a1 = a2 and b1 = b2 and c1 = c2).

Please kindly advise on a suitable significance test in R for this purpose, and on how to apply the Bonferroni procedure in the test if necessary.

Thank you in advance.

Elaine
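One common way to approach the question above is to stack both responses into one data set, fit a single common curve and a model with separate coefficients per series, and compare the two fits with an F test. The sketch below uses simulated data, and all variable names are hypothetical. It also assumes that both series carry independent random error, which may not hold if the "predicted" distances are deterministic outputs of another model.

```r
set.seed(1)
# Simulated, purely illustrative data: body mass and two distance series
mass <- runif(40, 0.1, 1)
observed  <- 5 + 20 * mass - 8 * mass^2 + rnorm(40, sd = 1)
predicted <- 4 + 22 * mass - 9 * mass^2 + rnorm(40, sd = 1)

# Stack both series and mark which one each row belongs to
long <- data.frame(
  distance = c(observed, predicted),
  mass     = rep(mass, 2),
  type     = factor(rep(c("observed", "predicted"), each = 40))
)

# Null model: one common curve (a1 = a2, b1 = b2, c1 = c2)
fit_common <- lm(distance ~ mass + I(mass^2), data = long)
# Alternative: separate intercept, linear and quadratic terms per series
fit_separate <- lm(distance ~ type * (mass + I(mass^2)), data = long)

# Joint F test of the three extra coefficients
cmp <- anova(fit_common, fit_separate)
p_joint <- cmp[["Pr(>F)"]][2]

# Per-coefficient tests, Bonferroni-adjusted for the three comparisons
coefs  <- summary(fit_separate)$coefficients
p_each <- coefs[grep("^typepredicted", rownames(coefs)), "Pr(>|t|)"]
p_bonf <- p.adjust(p_each, method = "bonferroni")
```

The joint test addresses the stated null hypothesis directly. The Bonferroni step is only needed when the three coefficient differences are also examined one at a time.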
Re: [R] Self-developed package -- installation
On 03.05.2013 07:54, PIKAL Petr wrote:
> Hi
>
> Probably others can give you some better insight but copying folder with package from one machine to another is possible until the installation is required by a new version of R (about each 3 years).

Reinstallation may be required more often, and we expect that packages need to be reinstalled at least if x or y are increased in a new R-x.y.z release. In rather rare cases this also happens for patch level updates. There are examples where a reinstallation is not required that often, but that is not guaranteed.

Best,
Uwe Ligges

> Petr
>
>> -----Original Message-----
>> From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-project.org] On Behalf Of Hui Du
>> Sent: Thursday, May 02, 2013 6:55 PM
>> To: r-help@r-project.org
>> Subject: [R] Self-developed package -- installation
>>
>> Hi All,
>>
>> I have a question about package installation in R. We have developed a package, say 'ABC'. We have installed it in two machines, A and B by running 'Install Package(s) from local zip file'. Everything was fine. Right now, suppose that package got damaged in machine A and our zipped file is gone. My question is that may I directly copy ../library/ABC from machine B to machine A rather than running 'Install Package(s) from local zip file' (I don't have that zip file anymore)?
>>
>> Thanks.
>>
>> HXD
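Before copying an installed package folder between machines, it can help to check which R version the package was built under and compare it with the R that will load it. A minimal sketch follows, using the "stats" package only because it is always installed (substitute the package of interest, such as the hypothetical 'ABC'). The helper name `major_minor` is an assumption made for this illustration.

```r
# Report the R version a package was built under versus the running R.
pkg <- "stats"   # stand-in; replace with the package of interest
built_field <- packageDescription(pkg)$Built       # starts with "R x.y.z; ..."
built_version <- sub("^R ([0-9.]+);.*$", "\\1", built_field)

# Extract the x and y of an "x.y.z" version string as integers
major_minor <- function(v) unclass(package_version(v))[[1]][1:2]

running_version <- as.character(getRversion())
same_minor <- identical(major_minor(built_version),
                        major_minor(running_version))

cat("built under R", built_version, "- running R", running_version, "\n")
```

If `same_minor` is FALSE, the advice in this thread suggests reinstalling from source or from a freshly built binary rather than copying the library folder.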
Re: [R] Likelihood
> I have run a regression and want to calculate the likelihood of obtaining the sample. Is there a way in which I can use R to get this likelihood value?

See ?logLik

And see also ?help.search and ??. You would have found the above by typing

    ??likelihood

at the command line in R.

S Ellison
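As an illustration of the logLik() suggestion, here is a minimal sketch on the built-in mtcars data (not the poster's data; the model formula is only an example). It also cross-checks the value against the closed-form Gaussian log-likelihood evaluated at the maximum-likelihood estimates.

```r
# Built-in data set used purely for illustration
fit <- lm(mpg ~ wt + hp, data = mtcars)

ll  <- logLik(fit)            # log-likelihood at the ML estimates
lik <- exp(as.numeric(ll))    # the likelihood itself; can underflow for large n

# Cross-check against the closed form for Gaussian linear models:
#   logL = -n/2 * (log(2*pi) + log(RSS/n) + 1)
n   <- nobs(fit)
rss <- sum(residuals(fit)^2)
ll_manual <- -n / 2 * (log(2 * pi) + log(rss / n) + 1)

print(ll)
```

The "df" attribute of the result counts the regression coefficients plus the error variance, which is what AIC() and BIC() use.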
[R] R does not subset
Hi everyone,

I know there have been several requests regarding subsetting before, but none of them really helps with my problem:

I'm trying to subset only infected individuals from the REC2 data.frame:

    str(REC2)
    'data.frame': 362 obs. of 7 variables:
     $ RINGNO   : Factor w/ 370 levels "BL17546","BL17577",..: 78 81 67 41 58 66 17
     $ year     : Factor w/ 8 levels "Y2002","Y2003",..: 1 2 1 2 1 1 2 1 1 3 ...
     $ ccFLEDGE : int 6 6 6 5 6 7 6 7 6 5 ...
     $ rec2012  : int 2 1 2 2 1 2 1 1 1 0 ...
     $ binage   : Factor w/ 2 levels "ad","juv": 1 2 1 1 1 1 1 1 1 1 ...
     $ INFECTION: Factor w/ 2 levels "Infected ","Uninfected ": 2 1 2 1 2 2 1 2 2 1 ...
     $ all.rsLD : num -4.62 -6.19 -3.62 -4.19 -2.62 ...

using either

    RECinf <- REC2[which(REC2$INFECTION=="Infected"),]

or

    RECinf <- subset(REC2, INFECTION=="Infected")

In both cases I get an empty data frame (0 observations):

    str(RECinf)
    'data.frame': 0 obs. of 7 variables:
     $ RINGNO   : Factor w/ 370 levels "BL17546","BL17577",..:
     $ year     : Factor w/ 8 levels "Y2002","Y2003",..:
     $ ccFLEDGE : int
     $ rec2012  : int
     $ binage   : Factor w/ 2 levels "ad","juv":
     $ INFECTION: Factor w/ 2 levels "Infected ","Uninfected ":
     $ all.rsLD : num

When subsetting, R doesn't return any warning or error message. Besides, I have used the same code many times before and it worked perfectly well. Any ideas why this case is different?

Thanks for your help,
Kasia
Re: [R] R does not subset
You have an extra space in the INFECTION factor levels. Use

    REC2[REC2$INFECTION=="Infected ",]

or

    subset(REC2, INFECTION=="Infected ")

No need to use which here.

On May 3, 2013, at 5:48 AM, Katarzyna Kulma wrote:

> I'm trying to subset only infected individuals from the REC2 data.frame:
>
>     str(REC2)
>     [...]
>      $ INFECTION: Factor w/ 2 levels "Infected ","Uninfected ": 2 1 2 1 2 2 1 2 2 1 ...
>     [...]
>
> using either
>
>     RECinf <- REC2[which(REC2$INFECTION=="Infected"),]
>
> or
>
>     RECinf <- subset(REC2, INFECTION=="Infected")
>
> in both cases I get empty data frame (0 observations)
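Rather than remembering the trailing space in every comparison, the level labels can be cleaned once. The sketch below uses a hypothetical miniature of REC2 (`REC2_demo`, with made-up ring numbers) that has the same trailing-space levels as in the str() output.

```r
# Hypothetical miniature of REC2 with the same trailing-space levels
REC2_demo <- data.frame(
  RINGNO    = c("BL1", "BL2", "BL3", "BL4"),
  INFECTION = factor(c("Uninfected ", "Infected ", "Uninfected ", "Infected "))
)

# Matching without the trailing space silently finds nothing
n_before <- nrow(subset(REC2_demo, INFECTION == "Infected"))

# Strip trailing whitespace from the level labels once, then subset as usual
levels(REC2_demo$INFECTION) <- sub("[[:space:]]+$", "",
                                   levels(REC2_demo$INFECTION))
RECinf_demo <- droplevels(subset(REC2_demo, INFECTION == "Infected"))
n_after <- nrow(RECinf_demo)
```

If the data come from a text file, reading them with `read.csv(..., strip.white = TRUE)` is another way to avoid the stray spaces at the source.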
Re: [R] Size of a refClass instance
Good tip. Thanks Morgan. I agree that a different structure might (necessarily) be in order. I wanted to create a tree where nodes in a tree were of different derived sub-classes -- possibly holding more data and behaving polymorphically. OO programming seemed ideal for this: lots of small things with specialized behavior -- but this isn't R's strength.

On May 2, 2013, at 4:57 PM, Martin Morgan wrote:

> On 05/01/2013 11:20 AM, David Kulp wrote:
>> I'm using refClass for a complex multi-directional tree structure with possibly 100,000s of nodes. The refClass design is very impressive and I'd love to use it, but I've found that the size of refClass instances are very large and creation time is slow. For example, below is a RefClass and normal S4 class. The RefClass requires about 4KB per instance vs 500B for the S4 class -- based on adding the Ncells and Vcells of used memory reported by gc(). And instantiation is more than twice as slow for a RefClass. (R 2.14.2) Anyone have thoughts on this and whether there's any hope for improving resources on either front?
>
> Hi David -- not necessarily helpful but creating a few large objects is always better than creating many small in R, so perhaps re-conceptualize your data structure? As a rough analogy, instead of constructing a graph as a large number of 'Node' instances each pointing to one another, a graph could be represented as a data.frame containing columns of 'from' and 'to' indexes (neighbour-edge list, a few large objects) or as an adjacency matrix. One would also implement creation and update of the few large objects in an R-friendly (vectorized) way.
>
> Perhaps there are existing packages that already model the data you're interested in? If your multi-directional tree can be represented as a graph, then perhaps
>
> http://bioconductor.org/packages/release/bioc/html/graph.html
>
> including facilities in the Boost graph library (RBGL, on the Bioconductor web site, too) or the igraph package can be put to use.
>
> Martin
>
>> I wonder what others are doing. I've been thinking about lightweight alternative implementations, but nothing particularly elegant has come to mind, yet!
>>
>> Thanks!
>>
>>     simple <- setRefClass('simple', fields = list(a = 'character', b = 'numeric'))
>>     gc()
>>     system.time(simple.list <- lapply(1:10, function(i) { simple$new(a='foo', b=i) }))
>>     gc()
>>
>>     setClass('simple2', representation(a = 'character', b = 'numeric'))
>>     setMethod('initialize', 'simple2', function(.Object, a, b) {
>>       .Object@a <- a
>>       .Object@b <- b
>>       .Object
>>     })
>>     gc()
>>     system.time(simple2.list <- lapply(1:10, function(i) { new('simple2', a='foo', b=i) }))
>>     gc()
>
> --
> Computational Biology / Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N.
> PO Box 19024 Seattle, WA 98109
>
> Location: Arnold Building M1 B861
> Phone: (206) 667-2793
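The "few large objects" idea from the reply above can be sketched as follows: the tree is held in two data frames, one for node attributes and one for parent-to-child edges, and traversal uses vectorized lookups instead of one object per node. All names and the tiny example tree are hypothetical.

```r
# Hypothetical tree: node 1 is the root; each row of 'edges' is parent -> child
nodes <- data.frame(id    = 1:6,
                    label = c("root", "a", "b", "a1", "a2", "b1"),
                    stringsAsFactors = FALSE)
edges <- data.frame(from = c(1, 1, 2, 2, 3),
                    to   = c(2, 3, 4, 5, 6))

# Children of one or more nodes via a single vectorized lookup
children_of <- function(ids) edges$to[edges$from %in% ids]

# All descendants, found by expanding the frontier one level at a time
descendants_of <- function(id) {
  found <- numeric(0)
  frontier <- id
  while (length(frontier) > 0) {
    frontier <- children_of(frontier)
    found <- c(found, frontier)
  }
  found
}

kids <- children_of(2)
below_root <- descendants_of(1)
```

Node-specific data of different kinds can live in extra columns, or in separate tables keyed by `id`, which loses the polymorphic dispatch of separate classes but keeps memory use and creation time low.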
[R] cURL ?
Dear Sir,

I tried to find cURL on the web but could not find a reliable file. There are some files on http://curl.haxx.se/, but I do not know which one is suitable for R or how to install it.

Kind Regards
Jawad Hussain Ashraf
VPO Aroop, Tehsil and District Gujranwala
Mobile phone# 03016673275

Date: Sun, 28 Apr 2013 19:07:05 +0100
From: rip...@stats.ox.ac.uk
To: miyanja...@hotmail.com
CC: r-help@r-project.org
Subject: Re: [R] unsupported url scheme

> On 28/04/2013 15:32, jawad hussain wrote:
>>     fileUrl <- "https://data.baltimorecity.gov/api/views/dz54-2aru/rows.csv?accessType=DOWNLOAD"
>>     download.file(fileUrl, destfile="./data/Cameras.csv", method="curl")
>>
>> I tried it after installing package RCurl but it gives the error message:
>>
>>     Error in download.file(fileUrl, destfile = "Cameras.csv") : unsupported URL scheme
>>
>> Can you help me to solve this problem?
>> JAWAD HUSSAIN ASHRAF
>
> Yes, simply install a version of cURL which supports that scheme, then re-install RCurl.
>
>> [[alternative HTML version deleted]]
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
> That does apply to you, too. No HTML, tell us your sessionInfo()
>
> --
> Brian D. Ripley, rip...@stats.ox.ac.uk
> Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
> University of Oxford, Tel: +44 1865 272861 (self)
> 1 South Parks Road, +44 1865 272866 (PA)
> Oxford OX1 3TG, UK Fax: +44 1865 272595
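Some background that may help here: download.file(method = "curl") runs the external curl command-line program, which has to be on the PATH, and this is separate from the RCurl package. More recent versions of R can also be built with their own libcurl support. The sketch below only checks what is available on the current machine; the helper name `pick_download_method` is hypothetical, and the download call itself is left commented out so that nothing is fetched.

```r
# Choose a download method that can plausibly handle https on this machine.
pick_download_method <- function() {
  # TRUE only if this build of R reports libcurl support
  has_libcurl <- isTRUE(unname(capabilities()["libcurl"]))
  if (has_libcurl) return("libcurl")
  # Otherwise look for the external curl command-line tool on the PATH
  if (nzchar(Sys.which("curl"))) return("curl")
  "auto"
}

method <- pick_download_method()
cat("download method that would be used:", method, "\n")

# fileUrl <- "https://data.baltimorecity.gov/api/views/dz54-2aru/rows.csv?accessType=DOWNLOAD"
# download.file(fileUrl, destfile = "Cameras.csv", method = method)
```

If the result is "auto" and https URLs still fail, installing the curl command-line tool and making sure its folder is on the PATH is one route. The output of sessionInfo(), as requested in the thread, would show which R version and platform are involved.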
Re: [R] R does not subset
    $ INFECTION: Factor w/ 2 levels "Infected ","Uninfected ": 2 1 2 1 2 2 1 2

It is a factor variable, so it takes numeric values; for Infected it is assigned the value 1.

    subset(REC2, INFECTION==1)

2013/5/3 Jorge I Velez jorgeivanve...@gmail.com

> Hi Kasia,
>
> You need
>
>     subset(REC2, INFECTION=="Infected ")
>
> (note the space after Infected).
>
> HTH,
> Jorge.-

--
Luis Iván Ortiz Valencia
Doutorando Saúde Pública - Epidemiologia, IESC, UFRJ
Estatístico Msc.
Spatial Analyst Msc.
Re: [R] R does not subset
Hi Luis,

thanks for the suggestion, but still nothing:

    RECinf2 <- subset(REC2, INFECTION==1)
    head(RECinf2)
    [1] RINGNO    year      ccFLEDGE  rec2012   binage    INFECTION all.rsLD
    <0 rows> (or 0-length row.names)

cheers,
Kasia

Katarzyna Kulma
PhD Student
Department of Ecology and Genetics
Institute of Ecology and Evolution/Animal Ecology
Uppsala University
Norbyvägen 18D
SE-752 36 Uppsala, Sweden
email: katarzyna.ku...@ebc.uu.se
Tel. +46 (0)18 471 2672
Fax. +46 18 471 6484

On 3 May 2013 14:13, Luis Iván Ortiz Valencia liov2...@gmail.com wrote:

>     $ INFECTION: Factor w/ 2 levels "Infected ","Uninfected ": 2 1 2 1 2 2 1 2
>
> it is a factor variable, so it takes numeric values, for Infected it is assigned value 1.
>
>     subset(REC2, INFECTION==1)
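The empty result above follows from how R compares factors: `==` on a factor compares the level labels, not the underlying integer codes, so `INFECTION == 1` looks for a label equal to "1". A minimal sketch with a made-up factor `f` that has the same levels:

```r
f <- factor(c("Uninfected ", "Infected ", "Uninfected "))

by_number <- f == 1                # compares the labels with "1": all FALSE
by_code   <- as.integer(f) == 1    # compares the underlying integer codes
by_label  <- f == levels(f)[1]     # compares with the exact first label

print(rbind(by_number, by_code, by_label))
```

Selecting by label, with the exact spelling including any trailing space, is usually safer than selecting by code, because the codes change if the set of levels changes.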
Re: [R] R does not subset
Hi Kasia,

You need

subset(REC2, INFECTION == "Infected ")

(note the space after Infected).

HTH,
Jorge.-

On Fri, May 3, 2013 at 7:48 PM, Katarzyna Kulma katarzyna.ku...@gmail.com wrote:

> Hi everyone,
>
> I know there have been several requests regarding subsetting before, but none of them really helps with my problem:
>
> I'm trying to subset only infected individuals from the REC2 data.frame:
>
> str(REC2)
> 'data.frame': 362 obs. of 7 variables:
>  $ RINGNO   : Factor w/ 370 levels "BL17546","BL17577",..: 78 81 67 41 58 66 17
>  $ year     : Factor w/ 8 levels "Y2002","Y2003",..: 1 2 1 2 1 1 2 1 1 3 ...
>  $ ccFLEDGE : int 6 6 6 5 6 7 6 7 6 5 ...
>  $ rec2012  : int 2 1 2 2 1 2 1 1 1 0 ...
>  $ binage   : Factor w/ 2 levels "ad","juv": 1 2 1 1 1 1 1 1 1 1 ...
>  $ INFECTION: Factor w/ 2 levels "Infected ","Uninfected ": 2 1 2 1 2 2 1 2 2 1 ...
>  $ all.rsLD : num -4.62 -6.19 -3.62 -4.19 -2.62 ...
>
> using either
>
> RECinf <- REC2[which(REC2$INFECTION == "Infected"), ]
>
> or
>
> RECinf <- subset(REC2, INFECTION == "Infected")
>
> in both cases I get empty data frame (0 observations):
>
> str(RECinf)
> 'data.frame': 0 obs. of 7 variables:
>  $ RINGNO   : Factor w/ 370 levels "BL17546","BL17577",..:
>  $ year     : Factor w/ 8 levels "Y2002","Y2003",..:
>  $ ccFLEDGE : int
>  $ rec2012  : int
>  $ binage   : Factor w/ 2 levels "ad","juv":
>  $ INFECTION: Factor w/ 2 levels "Infected ","Uninfected ":
>  $ all.rsLD : num
>
> When subsetting, R doesn't return any warning or error message. Besides, I used same codes many times before and they worked perfectly well. Any ideas why this case is different?
>
> Thanks for your help,
> Kasia
Re: [R] R does not subset
Jorge, thanks for your suggestions, but they give the same (empty) result:

RECinf <- subset(REC2, INFECTION == "Infected")
head(RECinf)
[1] RINGNO    year      ccFLEDGE  rec2012   binage    INFECTION all.rsLD
<0 rows> (or 0-length row.names)

but David's suggestion worked!:

RECinf <- REC2[REC2$INFECTION == "Infected ", ]
head(RECinf)
    RINGNO  year ccFLEDGE rec2012 binage INFECTION   all.rsLD
2  BX23298 Y2003        6       1    juv Infected  -6.1938776
4  BT53646 Y2003        5       2     ad Infected  -4.1938776
7  BT53248 Y2003        6       1     ad Infected  -2.1938776
11 BY75833 Y2004        5       0     ad Infected  -4.6574803
13 BX23067 Y2004        6       0     ad Infected  -3.6574803
17 BX24240 Y2004        6       0     ad Infected   0.3425197

still not sure why the subset() function didn't work, though.

Thanks for your help!

Katarzyna Kulma
PhD Student
Department of Ecology and Genetics
Institute of Ecology and Evolution/Animal Ecology
Uppsala University
Norbyvägen 18D
SE-752 36 Uppsala, Sweden
email: katarzyna.ku...@ebc.uu.se
Tel. +46 (0)18 471 2672
Fax. +46 18 471 6484

On 3 May 2013 13:13, David Kulp dk...@fiksu.com wrote:

> You have an extra space in the INFECTION factors. Use
>
> REC2[REC2$INFECTION == "Infected ", ]
>
> or
>
> subset(REC2, INFECTION == "Infected ")
>
> No need to use which here.
>
> On May 3, 2013, at 5:48 AM, Katarzyna Kulma wrote:
>
>> Hi everyone,
>>
>> I know there have been several requests regarding subsetting before, but none of them really helps with my problem:
>>
>> I'm trying to subset only infected individuals from the REC2 data.frame:
>>
>> str(REC2)
>> 'data.frame': 362 obs. of 7 variables:
>>  $ RINGNO   : Factor w/ 370 levels "BL17546","BL17577",..: 78 81 67 41 58 66 17
>>  $ year     : Factor w/ 8 levels "Y2002","Y2003",..: 1 2 1 2 1 1 2 1 1 3 ...
>>  $ ccFLEDGE : int 6 6 6 5 6 7 6 7 6 5 ...
>>  $ rec2012  : int 2 1 2 2 1 2 1 1 1 0 ...
>>  $ binage   : Factor w/ 2 levels "ad","juv": 1 2 1 1 1 1 1 1 1 1 ...
>>  $ INFECTION: Factor w/ 2 levels "Infected ","Uninfected ": 2 1 2 1 2 2 1 2 2 1 ...
>>  $ all.rsLD : num -4.62 -6.19 -3.62 -4.19 -2.62 ...
>>
>> using either
>>
>> RECinf <- REC2[which(REC2$INFECTION == "Infected"), ]
>>
>> or
>>
>> RECinf <- subset(REC2, INFECTION == "Infected")
>>
>> in both cases I get empty data frame (0 observations):
>>
>> str(RECinf)
>> 'data.frame': 0 obs. of 7 variables:
>>  $ RINGNO   : Factor w/ 370 levels "BL17546","BL17577",..:
>>  $ year     : Factor w/ 8 levels "Y2002","Y2003",..:
>>  $ ccFLEDGE : int
>>  $ rec2012  : int
>>  $ binage   : Factor w/ 2 levels "ad","juv":
>>  $ INFECTION: Factor w/ 2 levels "Infected ","Uninfected ":
>>  $ all.rsLD : num
>>
>> When subsetting, R doesn't return any warning or error message. Besides, I used same codes many times before and they worked perfectly well. Any ideas why this case is different?
>>
>> Thanks for your help,
>> Kasia
Re: [R] untar() error
On 03/05/2013 08:31, Hakim Abdi wrote: Dear List, I have a list of 600+ *.gz files that I would like to extract and read the geotiffs contained within them. I tried using the untar() function to simplify this task but I am stumped by an error. I've combed the Internet for a solution without luck. The details are below, and any help in solving this matter is appreciated. Those are most likely not tar files. What does file (the command-line program contained in Rtools) say they are? files = list.files(path = J:/GIMMS/NDVI, pattern = data.tif.gz, all.files = TRUE, full.names = TRUE, recursive = TRUE, ignore.case = TRUE, include.dirs = TRUE) lapply(files, untar) Error in rawToChar(block[seq_len(ns)]) : embedded nul in string: 'II*\0Ì \001´ \0\0`G\0\0\fn\0\0¸�\0\0d»\0\0\020â\0\0¼\b\001\0h/\001\0\024V\001\0À|\001\0l£\001\0\030Ê\001\0Äð\001\0p\027\002\0\034\002\0Èd\002\0t‹\002\0 ²\002\0ÌØ\002\0xÿ\002\0$\003\0�L\003\0|s\003' untar(files[1]) Error in rawToChar(block[seq_len(ns)]) : embedded nul in string: 'II*\0Ì \001´ \0\0`G\0\0\fn\0\0¸�\0\0d»\0\0\020â\0\0¼\b\001\0h/\001\0\024V\001\0À|\001\0l£\001\0\030Ê\001\0Äð\001\0p\027\002\0\034\002\0Èd\002\0t‹\002\0 ²\002\0ÌØ\002\0xÿ\002\0$\003\0�L\003\0|s\003' untar(J:/GIMMS/NDVI/1981/81aug15a.n07-VIg/81aug15a.n07-VIg_data.tif.gz) Error in rawToChar(block[seq_len(ns)]) : embedded nul in string: 'II*\0Ì \001´ \0\0`G\0\0\fn\0\0¸�\0\0d»\0\0\020â\0\0¼\b\001\0h/\001\0\024V\001\0À|\001\0l£\001\0\030Ê\001\0Äð\001\0p\027\002\0\034\002\0Èd\002\0t‹\002\0 ²\002\0ÌØ\002\0xÿ\002\0$\003\0�L\003\0|s\003' traceback() 3: rawToChar(block[seq_len(ns)]) 2: untar2(tarfile, files, list, exdir) 1: untar(files[1]) sessionInfo() R version 2.15.2 (2012-10-26) Platform: x86_64-w64-mingw32/x64 (64-bit) locale: [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252LC_MONETARY=English_United States.1252 LC_NUMERIC=C [5] LC_TIME=English_United States.1252 attached base packages: [1] stats graphics grDevices utils datasets methods base ___ 
Hakim Abdi
Doctoral Student
Physical Geography and Ecosystem Science
Lund University
Sölvegatan 12, 223 62 Lund, Sweden
Office: +46 (0) 46 2223132
Mobile: +46 (0) 73 9300116
Email: hakim.a...@nateko.lu.se

--
Brian D. Ripley, rip...@stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
Re: [R] R does not subset
Hi: (note the space after Infected) Since I lost a morning too with this issue, I am just curious, why is there a space? I know, it must be a dumb question, a reasonable programming rule, but that's my level :-) mike From: Jorge I Velez jorgeivanve...@gmail.com To:Katarzyna Kulma katarzyna.ku...@gmail.com Cc: R mailing list r-help@r-project.org Sent: Friday, May 3, 2013 6:01 AM Subject: Re: [R] R does not subset Hi Kasia, You need subset(REC2, INFECTION==Infected ) (note the space after Infected). HTH, Jorge.- On Fri, May 3, 2013 at 7:48 PM, Katarzyna Kulma katarzyna.ku...@gmail.comwrote: Hi everyone, I know there have been several requests regarding subsetting before, but none of them really helps with my problem: I'm trying to subset only infected individuals from the REC2 data.frame: str(REC2) 'data.frame': 362 obs. of 7 variables: $ RINGNO : Factor w/ 370 levels BL17546,BL17577,..: 78 81 67 41 58 66 17 $ year : Factor w/ 8 levels Y2002,Y2003,..: 1 2 1 2 1 1 2 1 1 3 ... $ ccFLEDGE : int 6 6 6 5 6 7 6 7 6 5 ... $ rec2012 : int 2 1 2 2 1 2 1 1 1 0 ... $ binage : Factor w/ 2 levels ad,juv: 1 2 1 1 1 1 1 1 1 1 ... $ INFECTION: Factor w/ 2 levels Infected ,Uninfected : 2 1 2 1 2 2 1 2 2 1 ... $ all.rsLD : num -4.62 -6.19 -3.62 -4.19 -2.62 ... using either RECinf-REC2[which (REC2$INFECTION==Infected),] or RECinf-subset(REC2, INFECTION==Infected) in both cases I get empty data frame (0 observations): str(RECinf) 'data.frame': 0 obs. of 7 variables: $ RINGNO : Factor w/ 370 levels BL17546,BL17577,..: $ year : Factor w/ 8 levels Y2002,Y2003,..: $ ccFLEDGE : int $ rec2012 : int $ binage : Factor w/ 2 levels ad,juv: $ INFECTION: Factor w/ 2 levels Infected ,Uninfected : $ all.rsLD : num When subsetting, R doesn't return any warning or error message. Besides, I used same codes many times beforeand they worked perfectly well. Any ideas why this case is different? 
Thanks for your help, Kasia
[R] Very basic statistics in R
Dear all,

A very simple question, but apparently not easy to solve in R: I have a sample of a variable x: (3, 4, 5, 2, ...)

I want to know:
- the mean of x -> mean(x)
- the uncertainty on x -> std.error(x)? Or sd(x)?
- the standard deviation of x -> ?
- the uncertainty on the standard deviation -> ?

Does anyone have an idea?

Thanks in advance,
regards,
Xavier

--
Xavier Prudent
Computational biology and evolutionary genomics

Guest scientist at the Max-Planck-Institut für Physik komplexer Systeme (MPI-PKS)
Noethnitzer Str. 38
01187 Dresden

Max Planck-Institute for Molecular Cell Biology and Genetics (MPI-CBG)
Pfotenhauerstraße 108
01307 Dresden

Phone: +49 351 210-2621
Mail: prudent [ at ] mpi-cbg.de
[R] Courses: Statistical Analysis with R - Bayesian Data Analysis with R and WinBUGS
Dear list members, Apologies for cross-posting. Please, find below the information of two statistical courses with R: 1) Statistical Analysis with R 2) Bayesian Data Analysis with R and WinBUGS If you have any question don't hesitate to contact me. Best regards, Pablo ++ *Two days course in: Statistical Analysis with R *Where: Linux Hotel, Essen-Horst, Germany *When: 14.06-15.06.2013 22.11-23.11.2013 13.12-14.12.2013 *Instructor: Dr. Pablo E. Verde ++ *Target audience: Data analysis with basic knowledge in statistics will benefit from this course. The course is intended as a first course in R but not as a first course in statistics or data analysis. ++ *Course content: Day 1: *Introduction to statistical analysis with R *Classical graphical functions (scatter plots, conditional plots, histograms, etc) *Data management with R (indexing and other advanced techniques) *Advance graphical techniques for data analysis: lattice plots and ggplot2 Day 2: *Statistical analysis based on computer simulation (bootstrap methods) *Regression modeling (linear/non-linear/logistic regression) *Issues in regression modeling (variable selection, model checking, etc.) *Prices: Public sector and commercial: 737.8 Euros (two days course, included VAT) Student: 450 Euro (two days course, included VAT). Some of the courses are frequently fully booked. So please notice that you may have to try several times, until you get a spare place. ++ ++ *Three days course in: Bayesian Data Analysis with R and WinBUGS *Where: Linux Hotel, Essen-Horst, Germany *When: 11.07-13.07.2013 07.11-09.11.2013 *Instructor: Dr. Pablo E. Verde ++ *Target audience: This course is for data analyst who are familiar with classical statistics and they want to get a working knowledge in Bayesian analysis. This is a 3 days intensive training course with 8 hours per day including lecturing and exercises. The course presentation is practical with many worked examples. 
To attend the course you do NOT need experience with R or with WinBUGS. Lectures are given in English. Discussions can be in English, German or Spanish. ++ *Course content: Day 1 *Lecture 1: Introduction to Bayesian Inference *Lecture 2: Bayesian analysis for single parameter models *Lecture 3: Prior distributions: univariate Day 2 *Lecture 4: Bayesian analysis for multiple parameter models *Lecture 5: An introduction to WinBUGS *Lecture 6: Multivariate models with WinBUGS Day 3 *Lecture 7: An introduction to MCMC computations *Lecture 8: Bayesian regression with WinBUGS *Lecture 9: Introduction to Hierarchical Statistical modeling *Prices: Public sector and commercial: 1088,85 Euro (three days course, included VAT) Student: 675 Euro (three days course, included VAT). Some of the courses are frequently fully booked. So please notice that you may have to try several times, until you get a spare place. ++ **For more information, please contact: i...@linuxhotel.de ++ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] cURL ?
If you don't know, we certainly don't. This is not a question about R or RCurl anymore... it is a question about cURL. You need to know what operating system your computer uses and how to enable SSL for cURL on that operating system... perhaps you need local technical assistance. --- Jeff NewmillerThe . . Go Live... DCN:jdnew...@dcn.davis.ca.usBasics: ##.#. ##.#. Live Go... Live: OO#.. Dead: OO#.. Playing Research Engineer (Solar/BatteriesO.O#. #.O#. with /Software/Embedded Controllers) .OO#. .OO#. rocks...1k --- Sent from my phone. Please excuse my brevity. jawad hussain miyanja...@hotmail.com wrote: Dear Sir I tried to find cURL on web but I do not find reliable file; there are some files on http://curl.haxx.se/. But I do not know which is suitable for R and how to install? Kind Regards Jawad Hussain Ashraf VPO Aroop, Tehsil and District GujranwalaMobile phone# 03016673275 Date: Sun, 28 Apr 2013 19:07:05 +0100 From: rip...@stats.ox.ac.uk To: miyanja...@hotmail.com CC: r-help@r-project.org Subject: Re: [R] unsupported url scheme On 28/04/2013 15:32, jawad hussain wrote: fileUrl - https://data.baltimorecity.gov/api/views/dz54-2aru/rows.csv?accessType=DOWNLOADdownload.file(fileUrl,destfile=./data/Cameras.csv,method=curl) I tried it after installing package RCurl but it give error message: Error in download.file(fileUrl, destfile = Cameras.csv) : unsupported URL schemeI can you help me to solve this problem. JAWAD HUSSAIN ASHRAF Yes, simply install a version of cURL which supports that scheme, then re-install RCurl. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. That does apply to you, too. No HTML, tell us your sessionInfo() -- Brian D. 
Ripley, rip...@stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595
Re: [R] R does not subset
This typically occurs because of sloppy manual data entry outside of R. To relieve further analysis pain, you can manually clean the data (usually only effective for one-time analyses) or use R to fix problems right after loading the data (there are multiple methods for doing this... I prefer using ?sub on character data before creating the factor).

---
Jeff Newmiller
DCN: jdnew...@dcn.davis.ca.us
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---
Sent from my phone. Please excuse my brevity.

Mihai Nica mihain...@yahoo.com wrote:

> Hi:
>
> (note the space after Infected)
>
> Since I lost a morning too with this issue, I am just curious, why is there a space? I know, it must be a dumb question, a reasonable programming rule, but that's my level :-)
>
> mike
>
> From: Jorge I Velez jorgeivanve...@gmail.com
> To: Katarzyna Kulma katarzyna.ku...@gmail.com
> Cc: R mailing list r-help@r-project.org
> Sent: Friday, May 3, 2013 6:01 AM
> Subject: Re: [R] R does not subset
>
>> Hi Kasia,
>>
>> You need
>>
>> subset(REC2, INFECTION == "Infected ")
>>
>> (note the space after Infected).
>>
>> HTH,
>> Jorge.-
>>
>> On Fri, May 3, 2013 at 7:48 PM, Katarzyna Kulma katarzyna.ku...@gmail.com wrote:
>>
>>> Hi everyone,
>>>
>>> I know there have been several requests regarding subsetting before, but none of them really helps with my problem:
>>>
>>> I'm trying to subset only infected individuals from the REC2 data.frame:
>>>
>>> str(REC2)
>>> 'data.frame': 362 obs. of 7 variables:
>>>  $ RINGNO   : Factor w/ 370 levels "BL17546","BL17577",..: 78 81 67 41 58 66 17
>>>  $ year     : Factor w/ 8 levels "Y2002","Y2003",..: 1 2 1 2 1 1 2 1 1 3 ...
>>>  $ ccFLEDGE : int 6 6 6 5 6 7 6 7 6 5 ...
>>>  $ rec2012  : int 2 1 2 2 1 2 1 1 1 0 ...
>>>  $ binage   : Factor w/ 2 levels "ad","juv": 1 2 1 1 1 1 1 1 1 1 ...
>>>  $ INFECTION: Factor w/ 2 levels "Infected ","Uninfected ": 2 1 2 1 2 2 1 2 2 1 ...
>>>  $ all.rsLD : num -4.62 -6.19 -3.62 -4.19 -2.62 ...
>>>
>>> using either
>>>
>>> RECinf <- REC2[which(REC2$INFECTION == "Infected"), ]
>>>
>>> or
>>>
>>> RECinf <- subset(REC2, INFECTION == "Infected")
>>>
>>> in both cases I get empty data frame (0 observations):
>>>
>>> str(RECinf)
>>> 'data.frame': 0 obs. of 7 variables:
>>>  $ RINGNO   : Factor w/ 370 levels "BL17546","BL17577",..:
>>>  $ year     : Factor w/ 8 levels "Y2002","Y2003",..:
>>>  $ ccFLEDGE : int
>>>  $ rec2012  : int
>>>  $ binage   : Factor w/ 2 levels "ad","juv":
>>>  $ INFECTION: Factor w/ 2 levels "Infected ","Uninfected ":
>>>  $ all.rsLD : num
>>>
>>> When subsetting, R doesn't return any warning or error message. Besides, I used same codes many times before and they worked perfectly well. Any ideas why this case is different?
>>>
>>> Thanks for your help,
>>> Kasia
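The clean-up Jeff describes (using sub() on the character data before creating the factor) could look roughly like the sketch below. The vector `raw` is a made-up stand-in for a column as it might arrive from a spreadsheet export; it is not taken from the thread's data.

```r
# Hypothetical raw column with trailing spaces left over from manual data entry.
raw <- c("Uninfected ", "Infected ", "Uninfected ", "Infected ")

# Without cleaning, the label a user would naturally type matches nothing.
n_before <- sum(factor(raw) == "Infected")

# Strip trailing whitespace from the character data, then build the factor.
clean <- sub("[[:space:]]+$", "", raw)
infection <- factor(clean)

# After cleaning, the plain label matches the two infected entries.
n_after <- sum(infection == "Infected")

print(levels(infection))
print(c(before = n_before, after = n_after))
```

Newer versions of R also offer trimws(), and read.table() has a strip.white argument that can remove the padding at import time when a separator is specified; either may be more convenient than sub(), depending on how the data are loaded.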
Re: [R] Very basic statistics in R
> - the mean x -> mean(x)
> - the uncertainty on x -> std.error(x)? Or sd(x)?
> - the standard deviation of x -> ?
> - the uncertainty on the standard deviation -> ?
> Anyone has an idea?

1. Use R's help system to look up 'standard deviation' and 'mean', e.g.:

??'standard deviation'
??'mean'

For the other two questions, consult your basic stats textbook; the answers can be calculated from the two above together with the number of observations.
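As a rough illustration of "calculated from the two above together with the number of observations", here is a minimal sketch. The sample values are made up, and the formula for the uncertainty on the standard deviation is the usual large-sample approximation, which assumes roughly normal data; it is one common textbook choice, not the only one.

```r
# Made-up sample, for illustration only.
x <- c(3, 4, 5, 2, 6, 4, 3, 5)
n <- length(x)

m <- mean(x)   # sample mean
s <- sd(x)     # sample standard deviation (n - 1 denominator)

# Standard error of the mean: the usual "uncertainty on the mean".
se_mean <- s / sqrt(n)

# Approximate standard error of the standard deviation, assuming
# approximately normal data (large-sample approximation).
se_sd <- s / sqrt(2 * (n - 1))

print(c(mean = m, sd = s, se_mean = se_mean, se_sd = se_sd))
```

Note that std.error() is not part of base R; as far as I know a function of that name is provided by the plotrix package, but the one-liner above gives the same quantity without extra dependencies.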
[R] print multiple plots to jpeg, one lattice and one ggplot2
Hello everybody,

I want to print two plots in one png file. I tried several options but I didn't succeed: the first plot (bwplot) prints to the defined position, but the second (ggplot) doesn't.

Any idea?

Thanks a lot,
Christophe

# Example:
#-
library(ggplot2)
library(lattice)
library(grid)

one <- bwplot(decrease ~ treatment, OrchardSprays, groups = rowpos,
              panel = "panel.superpose",
              panel.groups = "panel.linejoin",
              xlab = "treatment",
              key = list(lines = Rows(trellis.par.get("superpose.line"), c(1:7, 1)),
                         text = list(lab = as.character(unique(OrchardSprays$rowpos))),
                         columns = 4, title = "Row position"))

df <- data.frame(gp = factor(rep(letters[1:3], each = 10)), y = rnorm(30))
# Compute sample mean and standard deviation in each group
library(plyr)
ds <- ddply(df, .(gp), summarise, mean = mean(y), sd = sd(y))
two <- ggplot(df, aes(x = gp, y = y)) +
  geom_point() +
  geom_point(data = ds, aes(y = mean), colour = 'red', size = 3)

# 1. not working
jpeg(file=paste(pathgraph,'/fig03_profiltot','.png',sep=''), width = 600, height = 400, units="px", res=100)
print(one, position=c(0,0,0.5,1), more=TRUE)
print(two, position=c(0.5,0,1,1), )
dev.off()

# 2. not working
jpeg(file=paste(pathgraph,'/fig03_profiltot','.png',sep=''), width = 600, height = 400, units="px", res=100)
grid.newpage()
pushViewport(viewport(layout = grid.layout(1, 2)))
print(one, vp = viewport(layout.pos.row = 1, layout.pos.col = 1)) # this does not work
print(two, vp = viewport(layout.pos.row = 1, layout.pos.col = 2))
dev.off()

# 3. not working
jpeg(file=paste(pathgraph,'/fig03_profiltot','.png',sep=''), width = 600, height = 400, units="px", res=100)
par(mfrow=c(1,2))
one
two
dev.off()
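One approach that may work here is to manage the layout with grid viewports and draw each plot in its own cell: the lattice plot is printed with newpage = FALSE inside a pushed viewport, and the ggplot2 plot is printed with its vp argument. The sketch below is a simplified version under stated assumptions: the plots are reduced to their essentials, and the output file name (in tempdir()) is made up for illustration.

```r
library(lattice)
library(ggplot2)
library(grid)

# Simplified versions of the two plots from the question.
one <- bwplot(decrease ~ treatment, OrchardSprays, xlab = "treatment")
df <- data.frame(gp = factor(rep(letters[1:3], each = 10)), y = rnorm(30))
two <- ggplot(df, aes(x = gp, y = y)) + geom_point()

# Hypothetical output path; png() is used so the device matches the extension.
out_file <- file.path(tempdir(), "fig_side_by_side.png")
png(out_file, width = 600, height = 400, units = "px", res = 100)

grid.newpage()
pushViewport(viewport(layout = grid.layout(1, 2)))

# lattice: move into the left cell and draw without starting a new page.
pushViewport(viewport(layout.pos.row = 1, layout.pos.col = 1))
print(one, newpage = FALSE)
popViewport()

# ggplot2: the print method takes a viewport to draw into.
print(two, vp = viewport(layout.pos.row = 1, layout.pos.col = 2))

dev.off()
```

Two side remarks on the attempts in the question: par(mfrow = ...) only affects base graphics, so it has no effect on lattice or ggplot2 output, and jpeg() writes JPEG data even when the file name ends in .png.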
Re: [R] Calculating distance matrix for large dataset
Here's the result on R 3.0.0 64 bit under Windows 8:

A <- matrix(1:365000*144, nrow=365000, ncol=144)
dim(A)
[1] 365000    144
d <- dist(mydata_nor, method = "euclidean")
Error in as.matrix(x) : object 'mydata_nor' not found
d <- dist(A, method = "euclidean")
Error: cannot allocate vector of size 496.3 Gb
In addition: Warning messages:
1: In dist(A, method = "euclidean") :
  Reached total allocation of 8078Mb: see help(memory.size)
2: In dist(A, method = "euclidean") :
  Reached total allocation of 8078Mb: see help(memory.size)
3: In dist(A, method = "euclidean") :
  Reached total allocation of 8078Mb: see help(memory.size)
4: In dist(A, method = "euclidean") :
  Reached total allocation of 8078Mb: see help(memory.size)

Your message suggests that your system could not accurately compute the requirements. Unless you have access to a computer with 500 gigabytes of memory, you need to consider alternate approaches such as aggregating the data into longer time blocks or using kmeans.

-
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77840-4352

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of HJ YAN
Sent: Thursday, May 2, 2013 6:02 PM
To: r-help@r-project.org
Subject: [R] Calculating distance matrix for large dataset

> Dear R users
>
> I wondered if any of you ever tried to calculate distance matrix with very large data set, and if anyone out there can confirm this error message I got actually mean that my data is too large for this task.
>
> negative length vectors are not allowed
>
> My data size and code used:
>
> dim(mydata_nor)
> [1] 365000    144
> d <- dist(mydata_nor, method = "euclidean")
>
> Here my data has 1000 samples, each has a year of data observed at 10-minute intervals daily, so the size is (365 * 1000) * 144. I checked the manual of function 'dist' but can not see the upper limit size allowed, and I bet there should be one, so any hints is appreciated.
>
> I would also be grateful if any other method for calculating distance matrix for large dataset could be advised.
>
> I appreciate reproducible code should be provided for your advice, so try below if needed:
>
> A <- matrix(1:365000*144, nrow=365000, ncol=144)
> dim(A)
> [1] 365000    144
> d1 <- dist(A, method="euclidean")
> Error in dist(A, method = "euclidean") :
>   negative length vectors are not allowed
>
> Many thanks in advance!
>
> HJ
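The 496.3 Gb figure can be checked by hand: dist() stores one double per pair of rows, i.e. n(n-1)/2 values of 8 bytes each. The sketch below redoes that arithmetic and also shows that the number of pairs exceeds the largest 32-bit integer, which is at least consistent with the "negative length vectors" message (an overflow is my assumption about the cause, not something confirmed in the thread). The kmeans() call at the end runs on a small made-up matrix, purely to illustrate the alternative David mentions.

```r
n <- 365000

# Number of pairwise distances, computed in double precision.
n_pairs <- n * (n - 1) / 2

# Storage for the distances: 8 bytes per double, expressed in 1024^3-byte units.
gib <- 8 * n_pairs / 1024^3

# Would the length fit in a 32-bit integer?
fits_in_int <- n_pairs <= .Machine$integer.max

print(round(gib, 1))
print(fits_in_int)

# Clustering without a full distance matrix, on a small random stand-in matrix
# (2000 rows instead of 365000; the data and the choice of 5 centres are made up).
set.seed(1)
A_small <- matrix(rnorm(2000 * 144), nrow = 2000, ncol = 144)
km <- kmeans(A_small, centers = 5, iter.max = 50)
print(table(km$cluster))
```

Since kmeans() works from the data matrix and the cluster centres, its memory use grows with the number of rows rather than with the number of pairs of rows.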
Re: [R] Very basic statistics in R
I recommend you read the Introduction to R document that comes with R. Look for making vectors with the c() function, and using the mean() and sd() functions.

Note that this is not a homework help forum (read the Posting Guide mentioned at the bottom of every message). If this is not homework, you are going to need to do quite a bit of self study before you can ask questions clearly enough to get useful responses on this list. See http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example

---
Jeff Newmiller
DCN: jdnew...@dcn.davis.ca.us
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---
Sent from my phone. Please excuse my brevity.

Xavier Prudent prudentxav...@gmail.com wrote:

> Dear all,
>
> Very simple question, but apparently uneasy to solve in R: I have a sampling of a variable x: (3, 4. 5, 2, ...)
>
> I want to know:
> - the mean x -> mean(x)
> - the uncertainty on x -> std.error(x)? Or sd(x)?
> - the standard deviation of x -> ?
> - the uncertainty on the standard deviation -> ?
>
> Anyone has an idea?
>
> Thanks in advance,
> regards,
> Xavier
Re: [R] untar() error
untar != gunzip --- Jeff NewmillerThe . . Go Live... DCN:jdnew...@dcn.davis.ca.usBasics: ##.#. ##.#. Live Go... Live: OO#.. Dead: OO#.. Playing Research Engineer (Solar/BatteriesO.O#. #.O#. with /Software/Embedded Controllers) .OO#. .OO#. rocks...1k --- Sent from my phone. Please excuse my brevity. Prof Brian Ripley rip...@stats.ox.ac.uk wrote: On 03/05/2013 08:31, Hakim Abdi wrote: Dear List, I have a list of 600+ *.gz files that I would like to extract and read the geotiffs contained within them. I tried using the untar() function to simplify this task but I am stumped by an error. I've combed the Internet for a solution without luck. The details are below, and any help in solving this matter is appreciated. Those are most likely not tar files. What does file (the command-line program contained in Rtools) say they are? files = list.files(path = J:/GIMMS/NDVI, pattern = data.tif.gz, all.files = TRUE, full.names = TRUE, recursive = TRUE, ignore.case = TRUE, include.dirs = TRUE) lapply(files, untar) Error in rawToChar(block[seq_len(ns)]) : embedded nul in string: 'II*\0ÃŒ \001´ \0\0`G\0\0\fn\0\0¸â€\0\0d»\0\0\020â\0\0¼\b\001\0h/\001\0\024V\001\0À|\001\0l£\001\0\030Ê\001\0Äð\001\0p\027\002\0\034\002\0Èd\002\0t‹\002\0 ²\002\0ÌØ\002\0xÿ\002\0$\003\0ÃL\003\0|s\003' untar(files[1]) Error in rawToChar(block[seq_len(ns)]) : embedded nul in string: 'II*\0ÃŒ \001´ \0\0`G\0\0\fn\0\0¸â€\0\0d»\0\0\020â\0\0¼\b\001\0h/\001\0\024V\001\0À|\001\0l£\001\0\030Ê\001\0Äð\001\0p\027\002\0\034\002\0Èd\002\0t‹\002\0 ²\002\0ÌØ\002\0xÿ\002\0$\003\0ÃL\003\0|s\003' untar(J:/GIMMS/NDVI/1981/81aug15a.n07-VIg/81aug15a.n07-VIg_data.tif.gz) Error in rawToChar(block[seq_len(ns)]) : embedded nul in string: 'II*\0ÃŒ \001´ \0\0`G\0\0\fn\0\0¸â€\0\0d»\0\0\020â\0\0¼\b\001\0h/\001\0\024V\001\0À|\001\0l£\001\0\030Ê\001\0Äð\001\0p\027\002\0\034\002\0Èd\002\0t‹\002\0 ²\002\0ÌØ\002\0xÿ\002\0$\003\0ÃL\003\0|s\003' traceback() 3: rawToChar(block[seq_len(ns)]) 2: untar2(tarfile, files, list, exdir) 1: 
untar(files[1]) sessionInfo() R version 2.15.2 (2012-10-26) Platform: x86_64-w64-mingw32/x64 (64-bit) locale: [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252LC_MONETARY=English_United States.1252 LC_NUMERIC=C [5] LC_TIME=English_United States.1252 attached base packages: [1] stats graphics grDevices utils datasets methods base ___ Hakim Abdi Doctoral Student Physical Geography and Ecosystem Science Lund University Sölvegatan 12, 223 62 Lund, Sweden Office: +46 (0) 46 2223132 Mobile: +46 (0) 73 9300116 Email: hakim.a...@nateko.lu.se [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
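To spell out "untar != gunzip": the error text begins with II*\0, which looks like the start of a TIFF file, so each .gz most likely wraps a single GeoTIFF rather than a tar archive. Below is a minimal base-R sketch that decompresses such a file through a gzfile() connection. The helper name `gunzip_file` and the demo file are made up for illustration; the demo writes a few bytes to a temporary .gz file instead of touching the GIMMS data.

```r
# Hypothetical helper: decompress a single gzip-compressed file to 'dest'.
gunzip_file <- function(src, dest = sub("\\.gz$", "", src), chunk = 1e6) {
  inp <- gzfile(src, open = "rb")
  on.exit(close(inp), add = TRUE)
  out <- file(dest, open = "wb")
  on.exit(close(out), add = TRUE)
  repeat {
    buf <- readBin(inp, what = "raw", n = chunk)  # read up to 'chunk' bytes
    if (length(buf) == 0) break                   # end of the compressed stream
    writeBin(buf, out)
  }
  invisible(dest)
}

# Round-trip demo on made-up bytes (starting with the II*\0 byte sequence).
payload <- as.raw(c(0x49, 0x49, 0x2a, 0x00, 1:50))
gz_path <- file.path(tempdir(), "demo_data.tif.gz")
con <- gzfile(gz_path, open = "wb")
writeBin(payload, con)
close(con)

tif_path <- gunzip_file(gz_path)
restored <- readBin(tif_path, what = "raw", n = 1000)
print(identical(restored, payload))
```

For the real files, something like lapply(files, gunzip_file) should then leave the .tif files next to the archives; the R.utils package also provides a gunzip() function that could be used instead of a hand-written helper.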
Re: [R] Size of a refClass instance
Interesting conclusion. Alternatively, that representation of your object model may not be computationally effective. This discrepancy may be less exaggerated in C++, but you may still find that large numbers of objects are less efficient in their use of memory or cpu time than vector processing even there.

I would read the point of Martin's response as "Don't confuse your mental model of the solution with its implementation."

---
Jeff Newmiller
DCN: jdnew...@dcn.davis.ca.us
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---
Sent from my phone. Please excuse my brevity.

David Kulp dk...@fiksu.com wrote:

> Good tip. Thanks Morgan. I agree that a different structure might (necessarily) be in order. I wanted to create a tree where nodes in a tree were of different derived sub-classes -- possibly holding more data and behaving polymorphically. OO programming seemed ideal for this: lots of small things with specialized behavior -- but this isn't R's strength.
>
> On May 2, 2013, at 4:57 PM, Martin Morgan wrote:
>
>> On 05/01/2013 11:20 AM, David Kulp wrote:
>>> I'm using refClass for a complex multi-directional tree structure with possibly 100,000s of nodes. The refClass design is very impressive and I'd love to use it, but I've found that the size of refClass instances are very large and creation time is slow. For example, below is a RefClass and normal S4 class. The RefClass requires about 4KB per instance vs 500B for the S4 class -- based on adding the Ncells and Vcells of used memory reported by gc(). And instantiation is more than twice as slow for a RefClass. (R 2.14.2)
>>>
>>> Anyone have thoughts on this and whether there's any hope for improving resources on either front?
>>
>> Hi David -- not necessarily helpful but creating a few large objects is always better than creating many small in R, so perhaps re-conceptualize your data structure? As a rough analogy, instead of constructing a graph as a large number of 'Node' instances each pointing to one another, a graph could be represented as a data.frame containing columns of 'from' and 'to' indexes (neighbour-edge list, a few large objects) or as an adjacency matrix. One would also implement creation and update of the few large objects in an R-friendly (vectorized) way.
>>
>> Perhaps there are existing packages that already model the data you're interested in? If your multi-directional tree can be represented as a graph, then perhaps
>>
>> http://bioconductor.org/packages/release/bioc/html/graph.html
>>
>> including facilities in the Boost graph library (RBGL, on the Bioconductor web site, too) or the igraph package can be put to use.
>>
>> Martin
>>
>>> I wonder what others are doing. I've been thinking about lightweight alternative implementations, but nothing particularly elegant has come to mind, yet!
>>>
>>> Thanks!
>>>
>>> simple <- setRefClass('simple', fields = list(a = "character", b = "numeric"))
>>> gc()
>>> system.time(simple.list <- lapply(1:10, function(i) { simple$new(a='foo', b=i) }))
>>> gc()
>>>
>>> setClass('simple2', representation(a = "character", b = "numeric"))
>>> setMethod("initialize", "simple2", function(.Object, a, b) {
>>>   .Object@a <- a
>>>   .Object@b <- b
>>>   .Object
>>> })
>>> gc()
>>> system.time(simple2.list <- lapply(1:10, function(i) { new('simple2', a='foo', b=i) }))
>>> gc()
>>
>> --
>> Computational Biology / Fred Hutchinson Cancer Research Center
>> 1100 Fairview Ave. N.
>> PO Box 19024 Seattle, WA 98109
>> Location: Arnold Building M1 B861
>> Phone: (206) 667-2793
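Martin's suggestion of "a few large objects" instead of many small node instances can be sketched roughly as follows. This is only an illustration under assumptions: the 7-node tree, the node_type vector and the helper names children_of and parent_of are made up here and are not from the thread.

```r
# Hypothetical 7-node tree stored as an edge list: one row per parent -> child link
edges <- data.frame(from = c(1, 1, 2, 2, 3, 3),
                    to   = c(2, 3, 4, 5, 6, 7))

# Per-node data lives in parallel vectors indexed by node id (illustrative only)
node_type <- c("root", "inner", "inner", "leaf", "leaf", "leaf", "leaf")

# Vectorized lookups replace pointer-chasing between node objects
children_of <- function(node) edges$to[edges$from == node]
parent_of   <- function(node) edges$from[match(node, edges$to)]

children_of(2)       # children of node 2
parent_of(c(4, 7))   # parents of several nodes in one call

# Leaves are the nodes that never appear as a parent
leaves <- setdiff(edges$to, edges$from)
```

Specialized behaviour per node kind could then be dispatched on node_type rather than on the class of thousands of separate objects; whether that fits the original tree is a design call for the poster.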
Re: [R] Write date class as number of days from 1970
Hi,

Maybe this helps:

set.seed(24)
dat1 <- data.frame(date1 = sample(seq(as.Date("2012-09-14", format = "%Y-%m-%d"), length.out = 40, by = "day"), 20, replace = FALSE),
                   value = sample(1:60, 20, replace = TRUE))
dat1$days1 <- as.numeric(difftime(dat1$date1, as.Date("1970-01-01"), units = "days"))
# or
library(lubridate)
dat1$days2 <- days(dat1$date1)$day
head(dat1)
#        date1 value days1 days2
# 1 2012-09-25     6 15608 15608
# 2 2012-09-22    34 15605 15605
# 3 2012-10-10    44 15623 15623
# 4 2012-10-03     9 15616 15616
# 5 2012-10-07    14 15620 15620
# 6 2012-10-16    42 15629 15629
# or
library(chron)
as.numeric(as.chron(dat1$date1) - chron(0))
#  [1] 15608 15605 15623 15616 15620 15629 15606 15622 15631 15604 15615 15607
# [13] 15626 15624 15635 15619 15601 15598 15636 15599

A.K.

Dear all, I have a dataset with one column being of class Date. When I write the output, I would like that column being written as number of days from 1970-01-01. I could not find anywhere a way to do it. Thanks, Marco
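For writing the file itself, a shorter route is probably enough: a Date is stored internally as the number of days since 1970-01-01, so as.numeric() on the column already gives the wanted value. A minimal sketch with made-up data (the column names and the temporary output file are assumptions for illustration):

```r
# Hypothetical example data; the column names are made up for illustration
dat <- data.frame(date  = as.Date(c("1970-01-01", "1970-01-02", "2012-09-25")),
                  value = c(1, 2, 3))

# Replace the Date column by its day count since 1970-01-01
dat$date <- as.numeric(dat$date)
dat$date

# Write the converted table; a temporary file stands in for the real output path
out <- tempfile(fileext = ".csv")
write.csv(dat, out, row.names = FALSE)
written <- read.csv(out)
```

The last date matches the 15608 shown for 2012-09-25 in the output above.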
[R] read .csv file and plot a graph
Hi all,

I have a big .csv file (21Mb with 100 rows). It has this shape:

  x
1 NaN
2 NaN
3 0.23

and so on. So the first column has x as a header and then the row number; the second column contains values between -1 and 1, and NaN for empty values. What I need to do is create a new .csv file from this one excluding the NaN values, and plot a line graph using the new .csv file. Or can I use the old .csv file to plot a graph excluding the NaN values?

Thanks in advance for any help or suggestions.

Regards,
Vahe
[R] Write date class as number of days from 1970
Dear all, I have a dataset with one column being of class Date. When I write the output, I would like that column being written as number of days from 1970-01-01. I could not find anywhere a way to do it. Thanks, Marco
Re: [R] Problems with reading data by readWorksheetFromFile of XLConnect Package
sorry, i had assumed readWorksheetFromFile would give you back a data frame. all of the operations i recommended work on data.frame objects. at different points in the code, check if it's a data.frame or a matrix..

class( temp )

..you can check its current class at any point. and if it's a matrix, you can convert it to a data frame with

temp <- as.data.frame( temp )

On Fri, May 3, 2013 at 2:00 AM, jpm miao miao...@gmail.com wrote:

Hi Anthony,

Thank you very much. It works very well. However, after this line

temp <- sapply( temp , as.numeric )

the data becomes a series of numbers instead of a matrix. Is there any way to keep it a matrix?

Thanks,
Miao

temp <- readWorksheetFromFile("130502temp.xlsx", sheet=1, header=FALSE, startRow=2, endRow=11, startCol=2, endCol=5)
temp <- sapply( temp , function( x ) gsub( ',' , '' , x ) )
temp
      Col1   Col2 Col3  Col4
 [1,] 647853 1413 57662 27897
 [2,] 491400 1365 40919 20411
 [3,] 38604  -    5505  985
 [4,] 576    -    20    54
 [5,] 80845  21   10211 4494
 [6,] 36428  27   1007  1953
 [7,] 269915 587  32988 12779
 [8,] 224494 -    30554 9184
 [9,] 11858  587  -     686
[10,] 3742   -    81    415

temp <- sapply( temp , as.numeric )
Warning messages:
1: In lapply(X = X, FUN = FUN, ...) : NAs introduced by coercion
2: In lapply(X = X, FUN = FUN, ...) : NAs introduced by coercion
3: In lapply(X = X, FUN = FUN, ...) : NAs introduced by coercion
4: In lapply(X = X, FUN = FUN, ...) : NAs introduced by coercion
5: In lapply(X = X, FUN = FUN, ...) : NAs introduced by coercion

temp
647853 491400  38604    576  80845  36428 269915
647853 491400  38604    576  80845  36428 269915
224494  11858   3742   1413   1365      -      -
224494  11858   3742   1413   1365     NA     NA
    21     27    587      -    587      -  57662
    21     27    587     NA    587     NA  57662
 40919   5505     20  10211   1007  32988  30554
 40919   5505     20  10211   1007  32988  30554
     -     81  27897  20411    985     54   4494
    NA     81  27897  20411    985     54   4494
  1953  12779   9184    686    415
  1953  12779   9184    686    415

temp[ is.na( temp ) ] <- 0
temp
647853 491400  38604    576  80845  36428 269915
647853 491400  38604    576  80845  36428 269915
224494  11858   3742   1413   1365      -      -
224494  11858   3742   1413   1365      0      0
    21     27    587      -    587      -  57662
    21     27    587      0    587      0  57662
 40919   5505     20  10211   1007  32988  30554
 40919   5505     20  10211   1007  32988  30554
     -     81  27897  20411    985     54   4494
     0     81  27897  20411    985     54   4494
  1953  12779   9184    686    415
  1953  12779   9184    686    415

2013/5/2 Anthony Damico ajdam...@gmail.com

try adding colTypes = 'numeric' to your readWorksheetFromFile() call

if that doesn't work, try a few other steps

# view what data types your file is being read in as
sapply( temp , class )

# convert all fields to character if they're factor variables..
# but i don't think you need this, readWorksheet defaults to `character`
temp <- sapply( temp , as.character )

# you can also convert a subset like this
temp[ , c( 1 , 3:4 ) ] <- sapply( temp[ , c( 1 , 3:4 ) ] , as.character )

# remove commas from character strings
temp <- sapply( temp , function( x ) gsub( ',' , '' , x ) )

# convert all fields to numeric
temp <- sapply( temp , as.numeric )

# convert all NA fields to zeroes if you prefer
temp[ is.na( temp ) ] <- 0

On Wed, May 1, 2013 at 11:55 PM, jpm miao miao...@gmail.com wrote:

Hi,

Attached are two datasheets to be read. My raw data 130502temp.xlsx contains numbers with ' symbols, and they can't be read as numbers. Even if I copy and paste as numbers to form a new file 130502temp_number1.xlsx, they could not be read smoothly.

1. How can I read the datasheet as numbers?
2. How can I treat the notation - as (1) NA or (2) zero?

Thanks,
Miao

temp <- readWorksheetFromFile("130502temp.xlsx", sheet=1, header=FALSE, startRow=2, endRow=11, startCol=2, endCol=5)
temp
      Col1  Col2   Col3   Col4
1  647,853 1,413 57,662 27,897
2  491,400 1,365 40,919 20,411
3   38,604     -  5,505    985
4      576     -     20     54
5   80,845    21 10,211  4,494
6   36,428    27  1,007  1,953
7  269,915   587 32,988 12,779
8  224,494     - 30,554  9,184
9   11,858   587      -    686
10   3,742     -     81    415

temp[2,2]
[1] 1,365
temp[2,2]+3
Error in temp[2, 2] + 3 : non-numeric argument to binary operator

temp_num <- readWorksheetFromFile("130502temp_number1.xlsx", sheet=1, header=FALSE, startRow=2, endRow=11, startCol=2, endCol=5)
temp_num[2,2]
[1] 1,365
temp_num[2,2]+3
Error in temp_num[2, 2] + 3 : non-numeric argument to binary operator
as.numeric(temp_num[2,2])+3
[1] NA
Warning message:
NAs introduced by coercion
Re: [R] print multiple plots to jpeg, one lattice and one ggplot2
Something like this?

library(gridExtra)
grid.arrange(one, two)

Felipe D. Carrillo
Supervisory Fishery Biologist
Department of the Interior
US Fish & Wildlife Service
California, USA
http://www.fws.gov/redbluff/rbdd_jsmp.aspx

From: Christophe Bouffioux christophe@gmail.com
To: r-help@r-project.org
Sent: Friday, May 3, 2013 6:33 AM
Subject: [R] print multiple plots to jpeg, one lattice and one ggplot2

hello everybody,

I want to print two plots in one png file. I tried several options but I didn't succeed: the first plot (bwplot) prints to the defined position, but the second (ggplot) doesn't. Any idea?

Thanks a lot
Christophe

# Example:
#---------
library(ggplot2)
library(lattice)
library(grid)

one <- bwplot(decrease ~ treatment, OrchardSprays, groups = rowpos,
              panel = "panel.superpose", panel.groups = "panel.linejoin",
              xlab = "treatment",
              key = list(lines = Rows(trellis.par.get("superpose.line"), c(1:7, 1)),
                         text = list(lab = as.character(unique(OrchardSprays$rowpos))),
                         columns = 4, title = "Row position"))

df <- data.frame(gp = factor(rep(letters[1:3], each = 10)), y = rnorm(30))
# Compute sample mean and standard deviation in each group
library(plyr)
ds <- ddply(df, .(gp), summarise, mean = mean(y), sd = sd(y))
two <- ggplot(df, aes(x = gp, y = y)) +
  geom_point() +
  geom_point(data = ds, aes(y = mean), colour = 'red', size = 3)

# 1. not working
jpeg(file = paste(pathgraph, '/fig03_profiltot', '.png', sep = ''), width = 600, height = 400, units = "px", res = 100)
print(one, position = c(0, 0, 0.5, 1), more = TRUE)
print(two, position = c(0.5, 0, 1, 1))
dev.off()

# 2. not working
jpeg(file = paste(pathgraph, '/fig03_profiltot', '.png', sep = ''), width = 600, height = 400, units = "px", res = 100)
grid.newpage()
pushViewport(viewport(layout = grid.layout(1, 2)))
print(one, vp = viewport(layout.pos.row = 1, layout.pos.col = 1))  # this does not work
print(two, vp = viewport(layout.pos.row = 1, layout.pos.col = 2))
dev.off()

# 3. not working
jpeg(file = paste(pathgraph, '/fig03_profiltot', '.png', sep = ''), width = 600, height = 400, units = "px", res = 100)
par(mfrow = c(1, 2))
one
two
dev.off()
[R] Declare a set (list?) of many dataframes or matrices
Hi, I would like to read several datasets and would like to create a set (list? sequence?) of many empty dataframes. How could this be done? How could I declare a set (list? sequence?) of many empty matrices? Thanks, Miao
[R] Why can't R understand if(num!=NA)?
I have a program. When I write if(num!=NA) it yields an error message. However, if I write if(is.na(num)==FALSE) it works. Why doesn't the first statement work? Thanks, Miao
Re: [R] cURL ?
On Fri, May 3, 2013 at 11:31 AM, jawad hussain miyanja...@hotmail.com wrote:

Dear Sir, I tried to find cURL on the web but I could not find a reliable file; there are some files on http://curl.haxx.se/, but I do not know which is suitable for R and how to install it. Kind Regards

As usual, the OS is relevant here. What are you running? Linux package managers should be able to handle this for you. And I'd have guessed this was a "just works" for OS X.

MW

Jawad Hussain Ashraf
VPO Aroop, Tehsil and District Gujranwala
Mobile phone# 03016673275

Date: Sun, 28 Apr 2013 19:07:05 +0100
From: rip...@stats.ox.ac.uk
To: miyanja...@hotmail.com
CC: r-help@r-project.org
Subject: Re: [R] unsupported url scheme

On 28/04/2013 15:32, jawad hussain wrote:

fileUrl <- "https://data.baltimorecity.gov/api/views/dz54-2aru/rows.csv?accessType=DOWNLOAD"
download.file(fileUrl, destfile = "./data/Cameras.csv", method = "curl")

I tried it after installing package RCurl but it gives an error message:

Error in download.file(fileUrl, destfile = "Cameras.csv") : unsupported URL scheme

Can you help me to solve this problem? JAWAD HUSSAIN ASHRAF

Yes, simply install a version of cURL which supports that scheme, then re-install RCurl.

[[alternative HTML version deleted]]
__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

That does apply to you, too. No HTML, tell us your sessionInfo()

--
Brian D. Ripley, rip...@stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
[R] MANOVA summary.manova(m) : residuals have rank
Dear All,

I am trying to perform MANOVA. I have a table with 504 columns (species) and 36 rows, with two groupings (season and location).

Zx <- Z[c(4:504)]
Zxm <- as.matrix(Zx)
m <- manova(Zxm ~ Season*location, data = Z)

When I do summary.aov, I get a response for each species, but summary.manova gives:

summary.manova(m) : residuals have rank 24 < 501

What can be the reason for this error message?

Thank you,
Ozgul

Below you can see part of the table.

name          Season location Acetobacter  Aerococcus   Alishewanella Amaricoccus
xls-nord-01   J      w        0            0,024078979  0             0
bxls-sud-01   J      w        0            0            0             0
brux-nord-04  A      w        0            0            0             0
brux-sud-04   A      w        0            0            0             0
br-nord-07    Ju     w        0            0            0             0
br-sud-07     Ju     w        0            0            0             0
b-nord-10     O      w        0            0            0             0
bsud-10       O      w        0,107836089  0            0,107836089   0,035945363
Z1-01         J      u        0            0            0             0,040567951
Z3-01         J      u        0            0            0             0
Z5-01         J      d        0,023116043  0            0             0
Z7-01         J      d        0,014130281  0            0             0
Z9-01         J      d        0            0            0             0
Z10-01        J      d        0            0            0             0
Z12-01        J      d        0            0            0             0
Z1-04         A      u        0            0            0             0
Z3-04         A      u        0            0            0             0
Z5-04         A      d        0            0            0             0
Z7-04         A      d        0            0            0             0
Z9-04         A      d        0            0,013839873  0             0
Z10-04        A      d        0            0            0             0
Z12-04        A      d        0            0            0             0
Z1-07         Ju     u        0            0            0             0
Z3-07         Ju     u        0            0            0             0
Z5-07         Ju     d        0            0            0             0
Z7-07         Ju     d        0            0            0             0
Z9-07         Ju     d        0            0            0             0
Z10-07        Ju     d        0            0            0             0
Z12-07        Ju     d        0            0,022301517  0             0
Z1-10         O      u        0            0            0             0
Z3-10         O      u        0            0            0             0
Z5-10         O      d        0            0            0             0
Z7-10         O      d        0            0            0,052924054   0
Z9-10         O      d        0            0            0,035050824   0
Z10-10        O      d        0            0            0             0,040783034
Z12-10        O      d        0            0            0             0
[R] Is it a Headless problem? - Same code runs well in interactive R shell, but never terminates with Rscript
Dear R-Experts,

I seem to be dealing with a so-called headless problem in R. I wrote a quite extensive program that generates a Bayesian network from a query protein's Phylogenetic Tree and subsequently uses a message passing algorithm to infer the most likely annotation for the query leaf in the tree, using the other leaves' known -and proven- protein function annotations.

The program uses the following libraries:

library(tools)
library(Biostrings)
library(RCurl)
library(stringr)
library(ape)
library(gRain)  # gRain implements the message passing algorithm
library(RMySQL)
library(XML)
library(parallel)
library(brew)
library(xtable)

When the program is run from the command line as

Rscript prog.r inp.file

with certain input data inp, it gets stuck and never terminates. Memory usage sky-rockets and the process spends almost all of its time on system calls. Using the identical R code inside an interactive R shell with the very same input data inp, the script does not have any problems and finishes actually amazingly fast. I am flabbergasted and do require help. Hence my questions:

* Is anything known about a problem similar to mine appearing when using the above libraries?
* What is the difference -aside from the obvious missing interactiveness- between running the very same R code inside an interactive R shell or inside a file as an argument to Rscript?
* Does my problem indeed fall into the headless category?

The problem occurs in R version 2.15.2 (2012-10-26) -- "Trick or Treat" on Debian 6.0.2; uname -or gives 3.2.0-0.bpo.3-amd64 GNU/Linux.

Any help will be much appreciated. Have a pleasant day!
Re: [R] Package survey: singularities in linear regression models
Well, I have uploaded the data to the public folder of my Dropbox. Due to data confidentiality, I had to change the labels. To load the data:

con <- url("http://dl.dropboxusercontent.com/u/101865137/datEx.rda")
print(load(con))

# The replicate weights were created according to the jackknife (JK2) procedure in the same way as implemented in WesVar.
# According to 100 JK zones, 100 replicate weights result. The replicate weights are labelled totwgtM_1 to totwgtM_100.
# The regression I want to specify is achievement on group and origin. Both predictors are factors.
library(survey)
design <- svrepdesign(data = datEx[, c("origin", "group", "achievement")],
                      weights = datEx[, "pweight"], type = "JKn", scale = 1, rscales = 1,
                      repweights = datEx[, grep("^totwgtM_", colnames(datEx))],
                      combined.weights = TRUE, mse = TRUE)

# This works
mod1 <- svyglm(formula = achievement ~ origin + group, design = design, return.replicates = FALSE, family = gaussian(link = "identity"))

# I get the error message when specifying the interaction
mod2 <- svyglm(formula = achievement ~ origin * group, design = design, return.replicates = FALSE, family = gaussian(link = "identity"))

# The output of the conventional glm() function reports singularities for one coefficient of the interaction
mod3 <- glm(formula = achievement ~ origin * group, data = datEx, family = gaussian(link = "identity"))

Thanks again,
Sebastian

--
Sebastian Weirich, Dipl.-Psych.
Institut zur Qualitätsentwicklung im Bildungswesen
Humboldt-Universität zu Berlin
Sitz: Hannoversche Straße 19, 10115 Berlin
Postadresse: Unter den Linden 6, 10099 Berlin
Tel: +49-(0)30-2093-46512

On 02.05.2013 22:02, Thomas Lumley wrote:

On Fri, May 3, 2013 at 2:27 AM, Sebastian Weirich sebastian.weir...@iqb.hu-berlin.de wrote:

Hello, I want to specify a linear regression model in which the metric outcome is predicted by two factors and their interaction. glm() computes effects for each factor level and the levels of the interaction. In the case of singularities glm() displays NA for the corresponding coefficients. However, svyglm() aborts with an error message. Is there a possibility that svyglm() provides output for coefficients without singularities like glm()?

It's not true that svyglm() aborts with an error message whenever there are singularities, eg

svyglm(enroll~stype+I(stype), design=dclus1)

1 - level Cluster Sampling design
With (15) clusters.
svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc)

Call: svyglm(formula = enroll ~ stype + I(stype), design = dclus1)

Coefficients:
(Intercept)      stypeH      stypeM   I(stype)H   I(stype)M
      432.9       697.4       464.9          NA          NA

Degrees of Freedom: 182 Total (i.e. Null); 12 Residual
Null Deviance: 2483
Residual Deviance: 1512  AIC: 2599

So, perhaps you could show us what you actually did, and what actually happened, as the posting guidelines request.

-thomas

--
Thomas Lumley
Professor of Biostatistics
University of Auckland
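The glm() behaviour discussed in this thread (NA for coefficients that cannot be estimated) can be reproduced with a small made-up data set. This is only a sketch: the data frame d below is hypothetical and is not datEx. It has no observations in one origin-by-group cell, which makes the interaction term singular.

```r
# Hypothetical data with an empty cell: no rows with origin "b" and group "y"
d <- data.frame(origin = factor(c("a", "a", "a", "a", "b", "b")),
                group  = factor(c("x", "x", "y", "y", "x", "x")),
                y      = c(1, 2, 3, 4, 5, 6))

# The cross-tabulation shows the empty cell behind the singularity
cell_counts <- with(d, table(origin, group))

# glm() still fits; the coefficient that cannot be estimated is reported as NA
fit <- glm(y ~ origin * group, data = d, family = gaussian(link = "identity"))
est <- coef(fit)
est
```

Running the same cross-tabulation on the real data may show which origin-by-group combination has no (or too few) cases.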
Re: [R] Why can't R understand if(num!=NA)?
-----Original Message-----
if(num!=NA) it yields an error message. Why doesn't the first statement work?

Because you just compared something with NA (usually interpreted as 'missing') and because of that the comparison result is also NA. 'if' then tells you that you have a missing value where you need either TRUE or FALSE. Play with

num != NA             # returns NA

and

if (NA) "Not there"   # returns error

is.na() returns TRUE for NA's, so 'if' knows what to do with the answer.

S Ellison
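The behaviour described above can be checked directly. A small sketch (the value of num and the variable names are arbitrary; tryCatch() is used only so that the error does not stop the script):

```r
num <- 5

# Comparing anything with NA yields NA, not TRUE or FALSE
cmp <- (num != NA)
is.na(cmp)

# if() needs TRUE or FALSE, so an NA condition raises an error; capture it here
outcome <- tryCatch(
  if (num != NA) "not missing" else "missing",
  error = function(e) "error"
)

# is.na() always returns TRUE or FALSE, so it is safe inside if()
safe <- if (!is.na(num)) "not missing" else "missing"
```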
Re: [R] Why can't R understand if(num!=NA)?
You can use only if(!is.na(num))
Re: [R] (no subject)
On May 2, 2013, at 4:15 PM, T P Kharel wrote:

I have posted a R copula question yesterday but it is not accepted yet. How long does it take?

Generally moderated postings are accepted within 4-6 hours, usually sooner.

I am waiting if some one can help me on my Copula package related question. Thanks

I do not see any posting from a sender with a name containing the letters "kharel" on May 1, 2, or 3 in the archives, and since I just cleared the moderation queue it was not waiting there. Some postings from non-subscribed individuals are tossed away automatically by the spam filter and are never seen by the moderators as they (we) process the moderation queue. But in your case I see that you have subscribed. I am unable to explain why your posting did not reach the list. You should be able to see whether your posting was received by looking at the May 2013 threads at: https://stat.ethz.ch/pipermail/r-help/

[[alternative HTML version deleted]]

The HTML notice is evidence that you have not yet understood parts of the Posting Guide.

PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

David Winsemius
Alameda, CA, USA
Re: [R] Declare a set (list?) of many dataframes or matrices
Hello,

I can't say I understand the question, but if you want a list of empty dfs and a list of empty matrices, the following will do.

replicate(10, data.frame())
replicate(10, matrix(NA, nrow = 0, ncol = 0))

Hope this helps,

Rui Barradas

On 03-05-2013 16:20, jpm miao wrote:

Hi, I would like to read several datasets and would like to create a set (list? sequence?) of many empty dataframes. How could this be done? How could I declare a set (list? sequence?) of many empty matrices? Thanks, Miao
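A variation on the same idea, sketched under the assumption that the empty containers are only placeholders to be filled later: passing simplify = FALSE makes it explicit that replicate() should return a list, and vector("list", n) preallocates slots without creating dummy objects. The data filled in below are made up for illustration.

```r
n <- 3

# Lists of empty data frames and empty matrices; simplify = FALSE forces a list
dfs  <- replicate(n, data.frame(), simplify = FALSE)
mats <- replicate(n, matrix(NA_real_, nrow = 0, ncol = 0), simplify = FALSE)

# Alternative: preallocate empty slots and fill them as each dataset is read.
# The data frames created here are placeholders for the real datasets.
results <- vector("list", n)
for (i in seq_len(n)) {
  results[[i]] <- data.frame(id = i, value = i^2)
}

# Datasets with the same columns can be stacked afterwards
combined <- do.call(rbind, results)
```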
Re: [R] Why can't R understand if(num!=NA)?
A logical operation involving NA returns NA, never TRUE or FALSE. See the 8th Circle of the R Inferno (8.1.4): http://www.burns-stat.com/pages/Tutor/R_inferno.pdf

num <- 1
num == NA
[1] NA
is.na(num)
[1] FALSE

-
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77840-4352

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of jpm miao
Sent: Friday, May 3, 2013 10:25 AM
To: r-help
Subject: [R] Why can't R understand if(num!=NA)?

I have a program, when I write if(num!=NA) it yields an error message. However, if I write if(is.na(num)==FALSE) it works. Why doesn't the first statement work? Thanks, Miao
Re: [R] Problems with reading data by readWorksheetFromFile of XLConnect Package
On May 2, 2013, at 11:00 PM, jpm miao wrote:

Hi Anthony, Thank you very much. It works very well. However, after this line

temp <- sapply( temp , as.numeric )

the data becomes a series of numbers instead of a matrix. Is there any way to keep it a matrix?

Perhaps (assuming this were a data.frame to be coerced):

temp <- matrix( sapply( temp , as.numeric ), dim(temp)[1] )

But the persistence of the "-"s is puzzling. You should (as always) have posted the output from dput(temp).
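One way to keep the rectangular shape through the whole cleanup is to assign back into temp[] column by column, so the result stays a data.frame. This is a sketch under assumptions: the small temp below is a hand-typed stand-in for the worksheet, and to_num is a hypothetical helper name, not part of XLConnect.

```r
# Hand-typed stand-in for the worksheet contents (character columns)
temp <- data.frame(Col1 = c("647,853", "491,400", "38,604"),
                   Col2 = c("1,413", "1,365", "-"),
                   stringsAsFactors = FALSE)

# Hypothetical helper: strip thousands separators, treat "-" as missing
to_num <- function(x) {
  x <- gsub(",", "", x, fixed = TRUE)
  x[trimws(x) == "-"] <- NA
  as.numeric(x)
}

# Assigning into temp[] replaces the columns but keeps the data.frame structure
temp[] <- lapply(temp, to_num)

# A numeric matrix, if one is preferred, with "-" as NA ...
m <- as.matrix(temp)

# ... or with "-" as zero
m0 <- m
m0[is.na(m0)] <- 0
```

Replacing "-" before calling as.numeric() also avoids the "NAs introduced by coercion" warnings.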
Re: [R] Why can't R understand if(num!=NA)?
On May 3, 2013, at 10:24 AM, jpm miao miao...@gmail.com wrote:

I have a program, when I write if(num!=NA) it yields an error message. However, if I write if(is.na(num)==FALSE) it works. Why doesn't the first statement work? Thanks, Miao

NA is undefined:

NA == NA
[1] NA
NA != NA
[1] NA

Therefore the equality you are attempting does not return a TRUE or FALSE result; it is unknown and NA is returned. ?is.na was designed specifically to test for the presence of an NA value and return a TRUE or FALSE result which can then be tested.

See: http://cran.r-project.org/doc/manuals/r-release/R-intro.html#Missing-values

Regards,
Marc Schwartz
Re: [R] Why can't R understand if(num!=NA)?
On May 3, 2013, at 8:24 AM, jpm miao wrote:

I have a program, when I write if(num!=NA) it yields an error message. However, if I write if(is.na(num)==FALSE) it works. Why doesn't the first statement work?

Read the manual:

?NA

--
David Winsemius
Alameda, CA, USA
Re: [R] MANOVA summary.manova(m) : residuals have rank
On May 3, 2013, at 14:59 , Ozgul Inceoglu wrote:

Dear All,

I am trying to perform MANOVA. I have a table with 504 columns (species) and 36 rows, with two groupings (season and location).

    Zx <- Z[c(4:504)]
    Zxm <- as.matrix(Z)
    m <- manova(Zxm~Season*location, data=Z)

When I do summary.aov, I get a response for each species, but summary.manova gives:

    summary.manova(m) : residuals have rank 24 < 501

What can be the reason for this error message?

Too many columns and too few rows. Multivariate tests require more degrees of freedom than response variables.

Thank you,
Ozgul

Below you can see part of the table.

    name          Season  location  Acetobacter  Aerococcus   Alishewanella  Amaricoccus
    xls-nord-01   J       w         0            0,024078979  0              0
    bxls-sud-01   J       w         0            0            0              0
    brux-nord-04  A       w         0            0            0              0
    brux-sud-04   A       w         0            0            0              0
    br-nord-07    Ju      w         0            0            0              0
    br-sud-07     Ju      w         0            0            0              0
    b-nord-10     O       w         0            0            0              0
    bsud-10       O       w         0,107836089  0            0,107836089    0,035945363
    Z1-01         J       u         0            0            0              0,040567951
    Z3-01         J       u         0            0            0              0
    Z5-01         J       d         0,023116043  0            0              0
    Z7-01         J       d         0,014130281  0            0              0
    Z9-01         J       d         0            0            0              0
    Z10-01        J       d         0            0            0              0
    Z12-01        J       d         0            0            0              0
    Z1-04         A       u         0            0            0              0
    Z3-04         A       u         0            0            0              0
    Z5-04         A       d         0            0            0              0
    Z7-04         A       d         0            0            0              0
    Z9-04         A       d         0            0,013839873  0              0
    Z10-04        A       d         0            0            0              0
    Z12-04        A       d         0            0            0              0
    Z1-07         Ju      u         0            0            0              0
    Z3-07         Ju      u         0            0            0              0
    Z5-07         Ju      d         0            0            0              0
    Z7-07         Ju      d         0            0            0              0
    Z9-07         Ju      d         0            0            0              0
    Z10-07        Ju      d         0            0            0              0
    Z12-07        Ju      d         0            0,022301517  0              0
    Z1-10         O       u         0            0            0              0
    Z3-10         O       u         0            0            0              0
    Z5-10         O       d         0            0            0              0
    Z7-10         O       d         0            0            0,052924054    0
    Z9-10         O       d         0            0            0,035050824    0
    Z10-10        O       d         0            0            0              0,040783034
    Z12-10        O       d         0            0            0              0

--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk Priv: pda...@gmail.com
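The point about degrees of freedom can be checked on simulated data. The following is only a sketch under stated assumptions: the responses are random numbers, the design is a made-up balanced 4 x 3 layout with 36 rows, and the helper fit_with() is hypothetical. It is not the poster's data.

```r
# Sketch: 36 rows, 4 seasons x 3 locations = 12 cells, so 36 - 12 = 24 residual df.
set.seed(42)
n <- 36
Season   <- factor(rep(c("J", "A", "Ju", "O"), each = 9))
location <- factor(rep(c("w", "u", "d"), times = 12))

# Hypothetical helper: fit a MANOVA with p simulated response columns.
fit_with <- function(p) {
  Y <- matrix(rnorm(n * p), nrow = n)
  manova(Y ~ Season * location)
}

m_small <- fit_with(5)              # 5 responses, fewer than the residual df
res_df  <- df.residual(m_small)     # residual degrees of freedom of the fit
tests_small <- summary(m_small)     # multivariate tests can be computed

m_big <- fit_with(30)               # 30 responses, more than the residual df
err <- tryCatch(summary(m_big), error = function(e) conditionMessage(e))
print(res_df)
print(err)
```

With more response columns than residual degrees of freedom, the residual cross-product matrix is singular, which is what the error message reports. Reducing the number of responses (or collecting more rows) is the only way around it for the classical tests.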
Re: [R] Why can't R understand if(num!=NA)?
On 03-05-2013, at 17:24, jpm miao miao...@gmail.com wrote:

I have a program, when I write if(num!=NA) it yields an error message.

it? What is unclear about the error message?

However, if I write if(is.na(num)==FALSE) it works. Why doesn't the first statement work?

Read section 2.5 'Missing values' of the manual An Introduction to R.

Berend

Thanks,
Miao
Re: [R] read .csv file and plot a graph
Just read in and plot the data. The NaN will not be plotted:

    > input <- read.table(text = "x
    + 1 NaN
    + 2 NaN
    + 3 0.23
    + 4 .34
    + 5 .55
    + 6 .66
    + 7 NaN
    + 8 .88", header = TRUE)
    > plot(input$x)

On Fri, May 3, 2013 at 9:49 AM, Vahe nr vne...@gmail.com wrote:

Hi all,

I have a big .csv file (21Mb with 100 rows) it has this shape:

    x
    1 NaN
    2 NaN
    3 0.23

and so on. So the first column has x as a header then row number, the second column contains values between -1,1 and NaN for empty values.

What I need to do is: create a new .csv file from this one excluding NaN values and plot a line graph using the new .csv file. Or can I use the old .csv file to plot a graph excluding NaN values?

Thanks in advance for any help or suggestions.

Regards,
Vahe

--
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.
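If a cleaned file is wanted as well, a minimal sketch could look like the following. It assumes a two-column comma-separated layout with made-up column names id and x, and writes to a temporary file; the real file name and layout are not known from the thread.

```r
# Sketch: drop NaN rows, write a cleaned CSV, and draw a line graph.
input <- read.csv(text = "id,x
1,NaN
2,NaN
3,0.23
4,0.34
5,0.55
6,0.66
7,NaN
8,0.88")

clean <- input[!is.na(input$x), ]         # is.na() is TRUE for NaN as well as NA
out <- tempfile(fileext = ".csv")         # hypothetical output location
write.csv(clean, out, row.names = FALSE)

# In a non-interactive session this draws to the default graphics device.
plot(clean$id, clean$x, type = "l", xlab = "row", ylab = "x")
```

Plotting the original data with type = "l" also works, but lines() breaks at missing values, so the cleaned version gives one connected line.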
Re: [R] Declare a set (list?) of many dataframes or matrices
Hi,
I am not sure about what you meant.

    > lapply(1:5,function(i) data.frame())
    [[1]]
    data frame with 0 columns and 0 rows

    [[2]]
    data frame with 0 columns and 0 rows

    [[3]]
    data frame with 0 columns and 0 rows

    [[4]]
    data frame with 0 columns and 0 rows

    [[5]]
    data frame with 0 columns and 0 rows

A.K.

- Original Message -
From: jpm miao miao...@gmail.com
To: r-help r-help@r-project.org
Cc:
Sent: Friday, May 3, 2013 11:20 AM
Subject: [R] Declare a set (list?) of many dataframes or matrices

Hi,

I would like to read several datasets and would like to create a set (list? sequence?) of many empty dataframes. How could this be done? How could I declare a set (list? sequence?) of many empty matrices?

Thanks,
Miao
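For the matrix part of the question, and for filling the list while reading, a short sketch may help. The names (n_sets, slots, dataset1, ...) and the stand-in data frames are assumptions made for illustration; in practice each slot would receive the result of a read call.

```r
# Sketch: lists of empty containers, plus a pre-allocated list filled in a loop.
n_sets <- 5

frames   <- replicate(n_sets, data.frame(), simplify = FALSE)
matrices <- replicate(n_sets, matrix(numeric(0), nrow = 0, ncol = 0),
                      simplify = FALSE)

# Often it is enough to pre-allocate empty slots and fill them as data arrive.
slots <- vector("list", n_sets)
for (i in seq_len(n_sets)) {
  slots[[i]] <- data.frame(id = i, value = i^2)   # stand-in for a read call
}
names(slots) <- paste0("dataset", seq_len(n_sets))
str(slots$dataset3)
```

A list is the usual container here because its elements may have different dimensions, which a matrix or array would not allow.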
Re: [R] Why can't R understand if(num!=NA)?
    num1 <- c(0,NA,1,3)
    num1==NA
    #[1] NA NA NA NA
    num1!=NA
    #[1] NA NA NA NA
    is.na(num1)
    #[1] FALSE  TRUE FALSE FALSE

A.K.

- Original Message -
From: jpm miao miao...@gmail.com
To: r-help r-help@r-project.org
Cc:
Sent: Friday, May 3, 2013 11:24 AM
Subject: [R] Why can't R understand if(num!=NA)?

I have a program, when I write if(num!=NA) it yields an error message. However, if I write if(is.na(num)==FALSE) it works. Why doesn't the first statement work?

Thanks,
Miao
Re: [R] Change Selected Variables from Numeric to Factors
Hi ST,
Try this:

    set.seed(51)
    df1 <- as.data.frame(matrix(sample(1:40,60,replace=TRUE),ncol=10))
    df2 <- df1
    check <- c("V3","V7","V9")
    df1[,match(check,colnames(df1))] <- lapply(df1[,match(check,colnames(df1))],as.factor)
    str(df1)
    #'data.frame': 6 obs. of 10 variables:
    # $ V1 : int 32 9 12 40 9 34
    # $ V2 : int 31 17 39 5 21 28
    # $ V3 : Factor w/ 6 levels "1","6","7","10",..: 3 5 1 6 2 4
    # $ V4 : int 26 4 8 18 39 2
    # $ V5 : int 39 21 4 26 6 21
    # $ V6 : int 27 33 35 8 17 8
    # $ V7 : Factor w/ 5 levels "4","8","9","24",..: 2 3 4 1 3 5
    # $ V8 : int 4 12 12 32 13 37
    # $ V9 : Factor w/ 5 levels "10","31","33",..: 1 4 2 3 5 5
    # $ V10: int 13 26 20 22 14 5

    #or
    df2[check] <- lapply(check,function(x) as.factor(df2[[x]]))
    str(df2)
    #'data.frame': 6 obs. of 10 variables:
    # $ V1 : int 32 9 12 40 9 34
    # $ V2 : int 31 17 39 5 21 28
    # $ V3 : Factor w/ 6 levels "1","6","7","10",..: 3 5 1 6 2 4
    # $ V4 : int 26 4 8 18 39 2
    # $ V5 : int 39 21 4 26 6 21
    # $ V6 : int 27 33 35 8 17 8
    # $ V7 : Factor w/ 5 levels "4","8","9","24",..: 2 3 4 1 3 5
    # $ V8 : int 4 12 12 32 13 37
    # $ V9 : Factor w/ 5 levels "10","31","33",..: 1 4 2 3 5 5
    # $ V10: int 13 26 20 22 14 5

A.K.

I have a dataframe df with several columns. I need to change some of these to factors. Which columns I need to change to factors is in another vector, check. I am using this command

    sapply(check , function(x) df[[x]] <- as.factor(df[[x]]))

But this is not working. Can someone please advise.

Thanks.
-ST
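Why the original sapply() call has no effect may be worth spelling out: the assignment inside the anonymous function changes a copy that is local to that function, so the data frame outside is untouched. A small sketch on a made-up data frame (the column names a, b, c are assumptions) shows the difference:

```r
# Sketch: assignment inside a function does not modify the caller's data frame.
df <- data.frame(a = 1:3, b = c(2L, 2L, 5L), c = 4:6)
check <- c("a", "b")

# The poster's approach: each call converts a local copy and discards it.
invisible(sapply(check, function(x) df[[x]] <- as.factor(df[[x]])))
unchanged <- sapply(df, is.factor)     # still FALSE for every column

# Assigning the converted columns back at top level does the job.
df[check] <- lapply(df[check], as.factor)
changed <- sapply(df, is.factor)
print(changed)
```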
Re: [R] Problems with reading data by readWorksheetFromFile of XLConnect Package
you can also try:

    temp[] <- lapply(temp, as.numeric)

On Fri, May 3, 2013 at 11:54 AM, David Winsemius dwinsem...@comcast.net wrote:

On May 2, 2013, at 11:00 PM, jpm miao wrote:

Hi Anthony,

Thank you very much. It works very well. However, after this line

    temp <- sapply( temp , as.numeric )

the data becomes a series of numbers instead of a matrix. Is there any way to keep it a matrix?

Perhaps (assuming this were a data.frame to be coerced):

    temp <- matrix( sapply( temp , as.numeric ), dim(temp)[1])

But the persistence of the -'s is puzzling. You should (as always) have posted the output from dput(temp).

Thanks,
Miao

    > temp <- readWorksheetFromFile("130502temp.xlsx", sheet=1, header=FALSE, startRow=2, endRow= 11, startCol=2, endCol=5)
    > temp <- sapply( temp , function( x ) gsub( ',' , '' , x ) )
    > temp
          Col1     Col2   Col3    Col4
     [1,] "647853" "1413" "57662" "27897"
     [2,] "491400" "1365" "40919" "20411"
     [3,] "38604"  "-"    "5505"  "985"
     [4,] "576"    "-"    "20"    "54"
     [5,] "80845"  "21"   "10211" "4494"
     [6,] "36428"  "27"   "1007"  "1953"
     [7,] "269915" "587"  "32988" "12779"
     [8,] "224494" "-"    "30554" "9184"
     [9,] "11858"  "587"  "-"     "686"
    [10,] "3742"   "-"    "81"    "415"
    > temp <- sapply( temp , as.numeric )
    Warning messages:
    1: In lapply(X = X, FUN = FUN, ...) : NAs introduced by coercion
    2: In lapply(X = X, FUN = FUN, ...) : NAs introduced by coercion
    3: In lapply(X = X, FUN = FUN, ...) : NAs introduced by coercion
    4: In lapply(X = X, FUN = FUN, ...) : NAs introduced by coercion
    5: In lapply(X = X, FUN = FUN, ...) : NAs introduced by coercion
    > temp
    647853 491400  38604    576  80845  36428 269915
    647853 491400  38604    576  80845  36428 269915
    224494  11858   3742   1413   1365      -      -
    224494  11858   3742   1413   1365     NA     NA
        21     27    587      -    587      -  57662
        21     27    587     NA    587     NA  57662
     40919   5505     20  10211   1007  32988  30554
     40919   5505     20  10211   1007  32988  30554
         -     81  27897  20411    985     54   4494
        NA     81  27897  20411    985     54   4494
      1953  12779   9184    686    415
      1953  12779   9184    686    415
    > temp[ is.na( temp ) ] <- 0
    > temp
    647853 491400  38604    576  80845  36428 269915
    647853 491400  38604    576  80845  36428 269915
    224494  11858   3742   1413   1365      -      -
    224494  11858   3742   1413   1365      0      0
        21     27    587      -    587      -  57662
        21     27    587      0    587      0  57662
     40919   5505     20  10211   1007  32988  30554
     40919   5505     20  10211   1007  32988  30554
         -     81  27897  20411    985     54   4494
         0     81  27897  20411    985     54   4494
      1953  12779   9184    686    415
      1953  12779   9184    686    415

2013/5/2 Anthony Damico ajdam...@gmail.com

try adding colTypes = 'numeric' to your readWorksheetFromFile() call

if that doesn't work, try a few other steps

    # view what data types your file is being read in as
    sapply( temp , class )

    # convert all fields to character if they're factor variables..
    # but i don't think you need this, readWorksheet defaults to `character`
    temp <- sapply( temp , as.character )

    # you can also convert a subset like this
    temp[ , c( 1 , 3:4 ) ] <- sapply( temp[ , c( 1 , 3:4 ) ] , as.character )

    # remove commas from character strings
    temp <- sapply( temp , function( x ) gsub( ',' , '' , x ) )

    # convert all fields to numeric
    temp <- sapply( temp , as.numeric )

    # convert all NA fields to zeroes if you prefer
    temp[ is.na( temp ) ] <- 0

On Wed, May 1, 2013 at 11:55 PM, jpm miao miao...@gmail.com wrote:

Hi,

Attached are two datasheets to be read. My raw data 130502temp.xlsx contains numbers with ' symbols, and they can't be read as numbers. Even if I copy and paste as numbers to form a new file 130502temp_number1.xlsx, they could not be read smoothly.

1. How can I read the datasheet as numbers?
2. How can I treat the notation - as (1) NA or (2) zero?

Thanks,
Miao

    > temp <- readWorksheetFromFile("130502temp.xlsx", sheet=1, header=FALSE, startRow=2, endRow= 11, startCol=2, endCol=5)
    > temp
          Col1  Col2   Col3   Col4
    1  647,853 1,413 57,662 27,897
    2  491,400 1,365 40,919 20,411
    3   38,604     -  5,505    985
    4      576     -     20     54
    5   80,845    21 10,211  4,494
    6   36,428    27  1,007  1,953
    7  269,915   587 32,988 12,779
    8  224,494     - 30,554  9,184
    9   11,858   587      -    686
    10   3,742     -     81    415
    > temp[2,2]
    [1] "1,365"
    > temp[2,2]+3
    Error in temp[2, 2] + 3 : non-numeric argument to binary operator
    > temp_num <- readWorksheetFromFile("130502temp_number1.xlsx", sheet=1, header=FALSE, startRow=2, endRow= 11, startCol=2, endCol=5)
    > temp_num[2,2]
    [1] "1,365"
    > temp_num[2,2]+3
    Error in temp_num[2, 2] + 3 : non-numeric argument to binary operator
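The steps discussed in this thread can be combined so that the rectangular shape is kept. The following is a sketch only: it builds a small stand-in data frame instead of reading the spreadsheet, assumes the cells arrive as character strings, and uses a hypothetical helper to_number().

```r
# Sketch: strip thousands separators, treat "-" as missing, keep the data frame.
temp <- data.frame(
  Col1 = c("647,853", "38,604", "576"),
  Col2 = c("1,413", "-", "-"),
  stringsAsFactors = FALSE
)

# Hypothetical helper for one column.
to_number <- function(x) {
  x <- gsub(",", "", x, fixed = TRUE)   # "1,413" becomes "1413"
  x[x == "-"] <- NA                     # the dash marks a missing value
  as.numeric(x)
}

# temp[] <- ... replaces the columns in place, so the result is still a data frame.
temp[] <- lapply(temp, to_number)
str(temp)

# Optional: use zero instead of NA, as asked in the original question.
temp_zero <- temp
temp_zero[is.na(temp_zero)] <- 0
```

Setting the dash to NA before calling as.numeric() also avoids the coercion warnings, and as.matrix(temp) gives a numeric matrix if one is needed.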
Re: [R] Why can't R understand if(num!=NA)?
    if(num!=NA)

Why doesn't the first statement work?

An NA value means that the value is unknown. E.g.,

    age <- NA

means that you do not know the age of your subject. (The subject has an age; NA means you did not collect that data.) Thus you do not know the value of age == 6 either: the subject might be 6 or it might not be. Hence R makes the value of age==6 NA. Since R does not have different evaluation rules for literal values and expressions, that means that NA==6 and NA==someAge must evaluate to NA as well.

The second part of the question is why

    if (NA) { } else { }

causes an error. It is a bit arbitrary, but there is a mismatch between a 2-way 'if' statement and 3-valued logical data, and R deals with it by insisting that the condition in if (condition) { } else {} be either TRUE or FALSE, not NA.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of jpm miao
Sent: Friday, May 03, 2013 8:25 AM
To: r-help
Subject: [R] Why can't R understand if(num!=NA)?

I have a program, when I write if(num!=NA) it yields an error message. However, if I write if(is.na(num)==FALSE) it works. Why doesn't the first statement work?

Thanks,
Miao
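Both parts of this explanation can be tried directly. The sketch below reuses the age example from the message; the variable names cmp, msg and safe and the returned strings are made up for illustration.

```r
# Sketch: comparison with NA gives NA, and if() refuses an NA condition.
age <- NA
cmp <- age == 6                  # unknown, so NA rather than TRUE or FALSE

# Capture the error message instead of stopping the script.
msg <- tryCatch(
  if (age == 6) "six" else "not six",
  error = function(e) conditionMessage(e)
)
print(msg)

# isTRUE() maps NA to FALSE, which gives a two-way decision again.
safe <- if (isTRUE(age == 6)) "six" else "not six, or unknown"
print(safe)
```

Whether an unknown value should fall into the else branch is a modelling decision, so testing is.na() explicitly is often clearer than relying on isTRUE().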
Re: [R] Why can't R understand if(num!=NA)?
At a minimum, the first statement needs ==. Also, is.na() gives TRUE/FALSE, while a logical comparison to NA gives NA as a value.

Kevin

On Fri, May 3, 2013 at 10:24 AM, jpm miao miao...@gmail.com wrote:

I have a program, when I write if(num!=NA) it yields an error message. However, if I write if(is.na(num)==FALSE) it works. Why doesn't the first statement work?

Thanks,
Miao

--
Kevin Wright
Re: [R] R does not subset
-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Katarzyna Kulma
Sent: Friday, May 03, 2013 4:21 AM
To: David Kulp
Cc: r-help@r-project.org
Subject: Re: [R] R does not subset

Jorge, thanks for your suggestions, but they give the same (empty) result:

    > RECinf<-subset(REC2, INFECTION=="Infected")
    > head(RECinf)
    [1] RINGNO    year      ccFLEDGE  rec2012   binage    INFECTION all.rsLD
    <0 rows> (or 0-length row.names)

but David's suggestion worked! :

    > RECinf<-REC2[REC2$INFECTION=="Infected " ,]
    > head(RECinf)
        RINGNO  year ccFLEDGE rec2012 binage INFECTION   all.rsLD
    2  BX23298 Y2003        6       1    juv Infected  -6.1938776
    4  BT53646 Y2003        5       2     ad Infected  -4.1938776
    7  BT53248 Y2003        6       1     ad Infected  -2.1938776
    11 BY75833 Y2004        5       0     ad Infected  -4.6574803
    13 BX23067 Y2004        6       0     ad Infected  -3.6574803
    17 BX24240 Y2004        6       0     ad Infected   0.3425197

still not sure why the subset() function didn't work, though. Thanks for your help!

Maybe it didn't work because you still didn't have a space at the end of the value you were comparing (apparently the factor was defined with a space). Try the following (and notice the space at the end of "Infected "):

    RECinf<-subset(REC2, INFECTION=="Infected ")

David's suggestion worked because you did include a space there.

Dan

Daniel Nordlund
Bothell, WA USA
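A trailing space in a factor level is hard to see in printed output but easy to detect. The sketch below uses a three-row stand-in for REC2 (the real data are not available here) and shows two ways to make the space visible.

```r
# Sketch: detect trailing whitespace in factor levels.
REC2 <- data.frame(INFECTION = factor(c("Infected ", "Uninfected ", "Infected ")))

lv <- levels(REC2$INFECTION)
dput(lv)        # the quotes in the output make the trailing space visible
nchar(lv)       # one character longer than the visible words

n_exact  <- nrow(subset(REC2, INFECTION == "Infected"))    # no match
n_spaced <- nrow(subset(REC2, INFECTION == "Infected "))   # matches
c(n_exact, n_spaced)
```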
Re: [R] Counting number of consecutive occurrences per rows
Hi,

I'm sorry that it takes me so much time to respond; finally yesterday I got time to try your suggestions. Thank you for them!

I tried both, they give the same results, but in both there are some things I still need to solve. I would appreciate your help. I include a little bigger dataframe (test2, at the end of this email), with more differences in variables, to be able to better explain what I would like to calculate in addition.

*Jim's code:*

I needed to make some changes in assigning the key. Yours worked ok for that small test data, but when I tried it on my dataframe, which has around 25000 rows, it didn't work properly.

    test2$key[test2$act == 0] <- 1
    test2$key[test2$act > 0 & test2$act < 200] <- 2
    test2$key[test2$act == 200] <- 3
    # this works ok
    test2$resChange <- cumsum(c(1, abs(diff(test2$key
    test2$res <- ave(test2$resChange, test2$resChange, FUN = length)
    # I added new column by jul date
    test2$resJ <- ave(test2$resChange, test2$resChange, test2$juln, FUN = length)
    # this works fine as well, for dividing between day 0 and day 1
    test2$resJD <- ave(test2$resChange, test2$resChange, test2$juln, test2$day, FUN = length)

    # resume
    test2Resume <- test2[ , list(maxres = max(res) , minres = min(res) , sumres = length(unique(resChange))) , keyby = c('day', 'key')]
    # change 'key'
    test2Resume_day$key <- c('0', '1-199', '200')[test2Resume_day$key]
    test2Resume_day
       day   key maxres minres sumres
    1:   0     0      2      2      3
    2:   0 1-199      3      1      9
    3:   0   200      6      1      7
    4:   1     0      1      1      1
    5:   1 1-199     10      1      7
    6:   1   200      6      1      6

    # resume by juln
    test2Resume_jul <- test2[ , list(maxres = max(res) , minres = min(res) , sumres = length(unique(resChange))) , keyby = c('juln', 'key')] # by juln
    # change 'key'
    test2Resume_jul$key <- c('0', '1-199', '200')[test2Resume_jul$key]
    test2Resume_jul
        juln   key maxres minres sumres
    1: 15173     0      2      2      1
    2: 15173 1-199      3      1      7
    3: 15173   200      6      1      6
    4: 15174     0      2      1      3
    5: 15174 1-199     10      1      8
    6: 15174   200      6      1      6

It is ok, but what I would like to get is a resume for juln and for the variable day (0 and 1) as well.
Like this: juln day key maxres minressumres 15173 00 15173 01-199 15173 0200 15173 10 15173 11-199 15173 1200 15174 0 0 15174 0 1-199 15174 0 200 15174 1 0 15174 1 1-199 15174 1 200 ... The other thing is that the sumres I would like to calculate like a sum of values of occurencies for each key. For example, if in the test2 dataframe res values for key 200 (juln 15173) are 1, 1, 2,2,1,2 the sumres should be 9 (1+1+2+2+1+2), not 6 (which I suppose come form sum of number of unique occurencies). *Petr's code:* This works fine also, the thing is that doing the aggregation I would need the intervals to be like this [0, 1) [1, 199] (199, 200] what I don't know if is possible... I checked the hepl for cut, but I found that it can be closed just right or left... Thank you very much for your time and sharing your knowledge! Zuzana ## here is the bigger test2 dataframe dput(test2) structure(list(daten = structure(c(15173, 15173, 15173, 15173, 15173, 15173, 15173, 15173, 15173, 15173, 15173, 15173, 15173, 15173, 15173, 15173, 15173, 15173, 15173, 15173, 15173, 15173, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174), class = Date), juln = c(15173, 15173, 15173, 15173, 15173, 15173, 15173, 15173, 15173, 15173, 15173, 15173, 15173, 15173, 15173, 15173, 15173, 15173, 15173, 15173, 15173, 15173, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174), fen = c(win, win, win, win, win, win, win, win, win, win, win, win, win, win, win, win, win, win, win, win, win, win, win, win, win, win, win, win, win, win, win, win, win, win, win, win, win, win, win, win, win, 
win, win, win, win, win, win, win, win, win, win, win, win, win, win, win, win, win, win, win), night = structure(c(1310962792, 1310963392, 1310963992, 1310964592, 1310965192, 1310965792, 1310966392, 1310966992, 1310967592, 1310968192, 1310968792, 1310969392, 1310969992, 1310970592, 1310971192, 1310971792, 1310972392, 1310972992, 1310973592,
Re: [R] Why can't R understand if(num!=NA)?
On May 3, 2013, at 17:24 , jpm miao wrote:

I have a program, when I write if(num!=NA) it yields an error message. However, if I write if(is.na(num)==FALSE) it works. Why doesn't the first statement work?

Because comparison with an unknown value yields an unknown result.

By the way, comparing a logical value to FALSE is silly:

    if ( !is.na(num) )

will do it.

--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk Priv: pda...@gmail.com
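One caveat that may be worth a sketch: the test above assumes num holds a single value. If num is a vector, is.na(num) is a vector too, and the condition has to be reduced to one TRUE or FALSE first (recent R versions signal an error for a longer condition). The vector below is made up for illustration.

```r
# Sketch: reduce a vector of NA tests to a single condition before using if().
num <- c(0, NA, 1, 3)

has_missing <- any(is.na(num))      # a single TRUE or FALSE

if (!has_missing) {
  total <- sum(num)
} else {
  total <- sum(num, na.rm = TRUE)   # skip the missing entries explicitly
}
print(total)
```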
Re: [R] Calculating distance matrix for large dataset
On 03.05.2013 15:36, David Carlson wrote:

Here's the result on R 3.0.0 64 bit under Windows 8:

    > A<-matrix(1:365000*144,nrow=365000,ncol=144)
    > dim(A)
    [1] 365000    144
    > d <- dist(mydata_nor, method = "euclidean")
    Error in as.matrix(x) : object 'mydata_nor' not found
    > d <- dist(A, method = "euclidean")
    Error: cannot allocate vector of size 496.3 Gb
    In addition: Warning messages:
    1: In dist(A, method = "euclidean") :
      Reached total allocation of 8078Mb: see help(memory.size)
    2: In dist(A, method = "euclidean") :
      Reached total allocation of 8078Mb: see help(memory.size)
    3: In dist(A, method = "euclidean") :
      Reached total allocation of 8078Mb: see help(memory.size)
    4: In dist(A, method = "euclidean") :
      Reached total allocation of 8078Mb: see help(memory.size)

Your message suggests that your system could not accurately compute the requirements. Unless you have access to a computer with 500 gigabytes, you need to consider alternate approaches such as aggregating the data into longer time blocks or using kmeans.

Or to show how we can calculate it: Or simpler speaking, you need to calculate

    365000 * (365000-1) / 2 = 66612317500

distances and with 8 bytes each, hence you need

    66612317500 * 8 = 532898540000 Bytes = 532898540000 / (1024)^3 GB ~= 496.3 Gb

to store it in memory.

Best,
Uwe Ligges

-
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77840-4352

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of HJ YAN
Sent: Thursday, May 2, 2013 6:02 PM
To: r-help@r-project.org
Subject: [R] Calculating distance matrix for large dataset

Dear R users

I wondered if any of you ever tried to calculate a distance matrix with a very large data set, and if anyone out there can confirm that this error message I got actually means that my data is too large for this task.

    negative length vectors are not allowed

My data size and code used:

    > dim(mydata_nor)
    [1] 365000    144
    > d <- dist(mydata_nor, method = "euclidean")

Here my data has 1000 samples, each with a year of data observed at 10-minute intervals daily, so the size is (365 * 1000) * 144.

I checked the manual of function 'dist' but cannot see the upper limit size allowed, and I bet there should be one, so any hints are appreciated. I would also be grateful if any other method for calculating a distance matrix for a large dataset could be advised.

I appreciate reproducible code should be provided for your advice, so try below if needed:

    > A<-matrix(1:365000*144,nrow=365000,ncol=144)
    > dim(A)
    [1] 365000    144
    > d1<-dist(A,method="euclidean")
    Error in dist(A, method = "euclidean") : negative length vectors are not allowed

Many thanks in advance!

HJ
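The arithmetic above is easy to repeat for other sizes before calling dist(). This is a sketch with a hypothetical helper dist_bytes(); it counts only the n * (n - 1) / 2 stored distances at 8 bytes each and ignores any working memory.

```r
# Sketch: memory needed to hold the lower triangle of a distance matrix.
dist_bytes <- function(n, bytes_per_value = 8) {
  n_pairs <- n * (n - 1) / 2        # number of pairwise distances
  n_pairs * bytes_per_value
}

n <- 365000
gib <- dist_bytes(n) / 1024^3
round(gib, 1)

# The number of pairs is larger than a 32-bit integer can hold.
exceeds_int <- n * (n - 1) / 2 > .Machine$integer.max
exceeds_int
```

That the count of pairs does not fit in a 32-bit integer would be consistent with the "negative length vectors" message reported by the original poster, although that connection is an inference and not something stated in the thread.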
[R] Empirica Copula
Dear users

I am reposting this and hope it will be accepted this time.

I am using the copula package to fit my bivariate data and for simulation. As explained in the package documentation, we can feed our own data distribution to a copula as long as the d, p and q (pdf, cdf and quantile) functions are available. Hence my code for those is:

    # Make the functions for data distribution
    dSAR <- function(SAR){dexp(SAR, rate=0.5)}
    pSAR <- function(SAR){pexp(SAR, rate=0.5)}
    qSAR <- function(SAR){qexp(c(seq(0,1, .01)),SAR, rate=0.5)}
    dper <- function(per) {dexp(per,rate=0.5)}
    pper <- function(per){pexp(per,rate=0.5)}
    qper <- function(per){qexp(c(seq(0,1,.01)),per, rate=0.5)}
    gmb <- gumbelCopula(3,dim=2) # create bivariate copula object with dim=2
    #tau(gmb)
    ## construct a bivariate distribution with defined marginals
    myCDF <- mvdc(gmb, margins=c("exp","exp"), paramMargins=list(list(rate=0.5),list(rate=0.5)))
    # Use own data for bivariate CDF construction
    myCDF2 <- mvdc(gmb, margins=c("SAR","per"), paramMargins=list(list(rate=.5),list(rate=.5)))
    # Generate (bivariate) random numbers from that, and visualize
    x <- rMvdc(1000, myCDF2)

And I get an error message every time:

    > x <- rMvdc(1000, myCDF2)
    Error in qSAR(x, rate = 0.5) : unused argument(s) (rate = 0.5)

It works fine with myCDF and generates bivariate data:

    x <- rMvdc(1000, myCDF)

But my problem is that the simulated data (using myCDF) do not show the same relationship as in the original data. Hence I want to use my own empirical distribution (myCDF2) to simulate data. It looks like it is not taking the quantile function, qSAR. Is there any other way I can define my data distribution and feed it to the copula?

Thanks for help.
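The error message shows the quantile function being called as qSAR(x, rate = 0.5), so the margin functions need to accept the parameters listed in paramMargins. A minimal sketch of definitions with that shape follows; it assumes exponential margins as in the post, checks them with base R only, and does not call the copula package. The argument names x, q and p are illustrative.

```r
# Sketch: margin functions whose arguments match the call qSAR(x, rate = 0.5).
dSAR <- function(x, rate) dexp(x, rate = rate)   # density
pSAR <- function(q, rate) pexp(q, rate = rate)   # distribution function
qSAR <- function(p, rate) qexp(p, rate = rate)   # quantile function

u <- c(0.1, 0.5, 0.9)
q <- qSAR(u, rate = 0.5)              # same call shape as in the error message
round_trip <- pSAR(q, rate = 0.5)     # should reproduce u
all.equal(round_trip, u)
```

With dper, pper and qper defined in the same way, the mvdc() and rMvdc() calls from the post should no longer fail on an unused argument. For a genuinely empirical margin, the three functions would have to be built from the data (for example from quantile() and ecdf()) while keeping a parameter list that matches paramMargins; how best to do that is not covered in this thread.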
Re: [R] R does not subset
This is great! Thank you so much for taking the time to give us this hint.

mike

From: Jeff Newmiller jdnew...@dcn.davis.ca.us
To: Mihai Nica mihain...@yahoo.com; Mihai Nica mihain...@yahoo.com; Jorge I Velez jorgeivanve...@gmail.com; Katarzyna Kulma katarzyna.ku...@gmail.com
Cc: R mailing list r-help@r-project.org
Sent: Friday, May 3, 2013 8:16 AM
Subject: Re: [R] R does not subset

This typically occurs because of sloppy manual data entry outside of R. To relieve further analysis pain, you can manually clean the data (usually only effective for one-time analyses) or use R to fix problems right after loading the data (there are multiple methods for doing this... I prefer using ?sub on character data before creating the factor).

---
Jeff Newmiller
DCN:jdnew...@dcn.davis.ca.us
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---
Sent from my phone. Please excuse my brevity.

Mihai Nica mihain...@yahoo.com wrote:

Hi:

(note the space after Infected)

Since I lost a morning too with this issue, I am just curious, why is there a space? I know, it must be a dumb question, a reasonable programming rule, but that's my level :-)

mike

From: Jorge I Velez jorgeivanve...@gmail.com
To: Katarzyna Kulma katarzyna.ku...@gmail.com
Cc: R mailing list r-help@r-project.org
Sent: Friday, May 3, 2013 6:01 AM
Subject: Re: [R] R does not subset

Hi Kasia,

You need

    subset(REC2, INFECTION=="Infected ")

(note the space after Infected).

HTH,
Jorge.-

On Fri, May 3, 2013 at 7:48 PM, Katarzyna Kulma katarzyna.ku...@gmail.com wrote:

Hi everyone,

I know there have been several requests regarding subsetting before, but none of them really helps with my problem:

I'm trying to subset only infected individuals from the REC2 data.frame:

    > str(REC2)
    'data.frame': 362 obs. of 7 variables:
     $ RINGNO   : Factor w/ 370 levels "BL17546","BL17577",..: 78 81 67 41 58 66 17
     $ year     : Factor w/ 8 levels "Y2002","Y2003",..: 1 2 1 2 1 1 2 1 1 3 ...
     $ ccFLEDGE : int 6 6 6 5 6 7 6 7 6 5 ...
     $ rec2012  : int 2 1 2 2 1 2 1 1 1 0 ...
     $ binage   : Factor w/ 2 levels "ad","juv": 1 2 1 1 1 1 1 1 1 1 ...
     $ INFECTION: Factor w/ 2 levels "Infected ","Uninfected ": 2 1 2 1 2 2 1 2 2 1 ...
     $ all.rsLD : num -4.62 -6.19 -3.62 -4.19 -2.62 ...

using either

    RECinf<-REC2[which (REC2$INFECTION=="Infected"),]

or

    RECinf<-subset(REC2, INFECTION=="Infected")

in both cases I get an empty data frame (0 observations):

    > str(RECinf)
    'data.frame': 0 obs. of 7 variables:
     $ RINGNO   : Factor w/ 370 levels "BL17546","BL17577",..:
     $ year     : Factor w/ 8 levels "Y2002","Y2003",..:
     $ ccFLEDGE : int
     $ rec2012  : int
     $ binage   : Factor w/ 2 levels "ad","juv":
     $ INFECTION: Factor w/ 2 levels "Infected ","Uninfected ":
     $ all.rsLD : num

When subsetting, R doesn't return any warning or error message. Besides, I used the same code many times before and it worked perfectly well. Any ideas why this case is different?

Thanks for your help,
Kasia
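The clean-up suggested above (using sub on the character data before creating the factor) might look like the sketch below. The input vector is made up, and stripping leading whitespace as well is an extra assumption beyond what the thread describes.

```r
# Sketch: strip stray whitespace before the values become factor levels.
raw <- c("Infected ", "Uninfected ", "Infected ", " Uninfected")

clean <- sub("\\s+$", "", raw)      # drop trailing whitespace
clean <- sub("^\\s+", "", clean)    # drop leading whitespace too (assumption)

INFECTION <- factor(clean)
levels(INFECTION)
sum(INFECTION == "Infected")        # plain comparison now matches
```

Applied to a data frame column that is already a factor, the same substitution can be done on levels(x) directly, which avoids converting back and forth.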
Re: [R] Create and read symbolic links in Windows
Thanks for your suggestion...

I upgraded to R 3.0.0 in a 64-bit Windows 7 environment. This time when I use file.link, I get the following error message:

    'Cannot create a file when that file already exists'

And I don't see the link. The other function, file.copy, correctly copies to the target location. Still confused by the error messages...

Thanks,
Santosh

On Thu, May 2, 2013 at 11:42 PM, Prof Brian Ripley rip...@stats.ox.ac.uk wrote:

On 03/05/2013 07:33, Santosh wrote:

Thanks for the suggestions. In windows (Windows 7, 64-bit), I couldn't get file.symlink to work, but file.link did return the result to be TRUE but at the target location, I did not see any link. Not sure I am missing anything more.. Hope it's nothing to do with administrator accounts and administrator rights... Is it something I should check with my system administrator?

You may need to update your R: although the posting guide asked you to do that before posting. There was a relevant bug fix in 2.15.3.

Thanks,
Santosh

On Thu, May 2, 2013 at 12:22 PM, Prof Brian Ripley rip...@stats.ox.ac.uk wrote:

On 02/05/2013 19:50, Santosh wrote:

Dear Rxperts..

Got a couple of quick q's.. I am using R in windows environment (both 32-bit and 64-bit)

a) Is there a way to create symbolic links to some data files?

See ?file.symlink. ??'symbolic link' should have got you there. Note that this is not very useful for files, but that is a Windows and not an R restriction.

b) How do I read data from symbolic links?

The same ways you read data from files.

Thanks so much..
Santosh

--
Brian D. Ripley, rip...@stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
Re: [R] Create and read symbolic links in Windows
Just got it right, please ignore the previous posting... It worked! Prof Ripley made my day!! :) THANK YOU!

On Fri, May 3, 2013 at 11:23 AM, Santosh santosh2...@gmail.com wrote:

> Thanks for your suggestion... I upgraded to R 3.0.0 in a 64-bit Windows 7 environment. This time, when I use file.link, I get the following error message:
>
> 'Cannot create a file when that file already exists'
>
> And I don't see the link. The other function, file.copy, correctly copies to the target location. I am still confused by the error messages...
>
> Thanks,
> Santosh
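The thread does not say what finally fixed it, so the following is only a sketch of the calls discussed, under the assumption that the Windows message above means the target path already existed when file.link() was called. The file names are temporary ones made up for illustration.

```r
# Source file with a little data (temporary path, purely illustrative).
src <- tempfile(fileext = ".csv")
write.csv(data.frame(x = 1:3), src, row.names = FALSE)

# The link target must not exist yet; file.link() creates it as a hard link.
dest <- tempfile(fileext = ".csv")
if (file.exists(dest)) file.remove(dest)

ok <- file.link(src, dest)   # TRUE on success, FALSE (with a warning) otherwise

# A link is read exactly like an ordinary file.
dat <- if (ok) read.csv(dest) else NULL
dat
```

file.symlink(src, dest) has the same calling pattern; on Windows it may additionally depend on account privileges and the file system, as hinted at earlier in the thread.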
Re: [R] Why can't R understand if(num!=NA)?
On May 3, 2013, at 17:24, jpm miao wrote:
> I have a program, when I write if(num!=NA)
[snipped]

On May 3, 2013, at 10:46 AM, peter dalgaard wrote:
> Because comparison with an unknown value yields an unknown result.

Anything else would violate the Second Law of Thermodynamics. We cannot have comparisons reducing entropy, now can we? Uncertainty cannot run uphill.

--
David Winsemius
Alameda, CA, USA
[R] Fortune candidate! Re: Why can't R understand if(num!=NA)?
On Fri, May 3, 2013 at 3:36 PM, David Winsemius dwinsem...@comcast.net wrote:

> On May 3, 2013, at 17:24, jpm miao wrote:
>> I have a program, when I write if(num!=NA)
> [snipped]
>
> On May 3, 2013, at 10:46 AM, peter dalgaard wrote:
>> Because comparison with an unknown value yields an unknown result.
>
> Anything else would violate the Second Law of Thermodynamics. We cannot have comparisons reducing entropy, now can we? Uncertainty cannot run uphill.
>
> --
> David Winsemius
> Alameda, CA, USA
[R] (no subject)
Hi.

After I installed R 3.0.0.pkg for Mac, when I click the R icon to start up, I receive a message in red telling me that something is wrong, but I do not know how to fix it.

R version 3.0.0 (2013-04-03) -- "Masked Marvel"
Copyright (C) 2013 The R Foundation for Statistical Computing
Platform: x86_64-apple-darwin10.8.0 (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

During startup - Warning messages:
1: Setting LC_CTYPE failed, using "C"
2: Setting LC_COLLATE failed, using "C"
3: Setting LC_TIME failed, using "C"
4: Setting LC_MESSAGES failed, using "C"
5: Setting LC_PAPER failed, using "C"
[R.app GUI 1.60 (6476) x86_64-apple-darwin10.8.0]

WARNING: You're using a non-UTF8 locale, therefore only ASCII characters will work.
Please read R for Mac OS X FAQ (see Help) section 9 and adjust your system preferences accordingly.
[History restored from /Users/dinhtientrung/.Rapp.history]

starting httpd help server ... done

Would you mind sharing your experience with this situation, please?

Thank you so much. Hope to hear from you soon.

Trung
Re: [R] Why can't R understand if(num!=NA)?
On May 3, 2013, at 10:46 AM, peter dalgaard wrote:

> On May 3, 2013, at 17:24, jpm miao wrote:
>> I have a program, when I write
>>
>> if(num!=NA)
>>
>> it yields an error message. However, if I write
>>
>> if(is.na(num)==FALSE)
>>
>> it works. Why doesn't the first statement work?
>
> Because comparison with an unknown value yields an unknown result.

Anything else would violate the Second Law of Thermodynamics. We cannot have comparisons reducing entropy, now can we? Uncertainty cannot run uphill.

--
David Winsemius
Alameda, CA, USA
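To make the point above concrete, here is a small sketch; the helper name describe() is made up for illustration. Any comparison with NA evaluates to NA, and if() needs a definite TRUE or FALSE, so the test has to go through is.na().

```r
num <- NA

# Comparison with NA is itself NA, neither TRUE nor FALSE.
cmp <- num != NA
cmp

# if (num != NA) ... would therefore stop with
# "missing value where TRUE/FALSE needed".

# is.na() always returns TRUE or FALSE, so it is safe inside if().
describe <- function(num) {
  if (!is.na(num)) "value is known" else "value is missing"
}

describe(NA)
describe(3)
```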
Re: [R] (no subject)
On May 3, 2013, at 9:44 AM, Tien trung Dinh wrote:

> Hi. After I installed R 3.0.0.pkg for mac version , when click the icon R to startup . I receive the annoucement in red color to inform that something wrongs , but I do not know how to fix them .
>
> R version 3.0.0 (2013-04-03) -- "Masked Marvel"
> Copyright (C) 2013 The R Foundation for Statistical Computing
> Platform: x86_64-apple-darwin10.8.0 (64-bit)
>
> R is free software and comes with ABSOLUTELY NO WARRANTY.
> You are welcome to redistribute it under certain conditions.
> Type 'license()' or 'licence()' for distribution details.
>
> Natural language support but running in an English locale
>
> R is a collaborative project with many contributors.
> Type 'contributors()' for more information and
> 'citation()' on how to cite R or R packages in publications.
>
> Type 'demo()' for some demos, 'help()' for on-line help, or
> 'help.start()' for an HTML browser interface to help.
> Type 'q()' to quit R.
>
> During startup - Warning messages:
> 1: Setting LC_CTYPE failed, using "C"
> 2: Setting LC_COLLATE failed, using "C"
> 3: Setting LC_TIME failed, using "C"
> 4: Setting LC_MESSAGES failed, using "C"
> 5: Setting LC_PAPER failed, using "C"
> [R.app GUI 1.60 (6476) x86_64-apple-darwin10.8.0]
>
> WARNING: You're using a non-UTF8 locale, therefore only ASCII characters will work.
> Please read R for Mac OS X FAQ (see Help) section 9 and adjust your system preferences accordingly.
> [History restored from /Users/dinhtientrung/.Rapp.history]
>
> starting httpd help server ... done
>
> Would you mind sharing your experiences in this situation for me please !

Why have you stopped at this point? (My experiences have been quite good with following advice.) You have been given a very specific warning (not an error). It is telling you where to find additional information. It is your responsibility to educate yourself further.

The document referred to can be found by pulling down the Help menu (while running R.app) and choosing "R Help".

--
David Winsemius
Alameda, CA, USA
[R] color by group in ggplot
Hey,

I have a dataset like this:

ID  Var1  Var2  Group
A1  1     1     BB
A2  1     2     AA
B1  2     1     CC
B2  1     3     DD
C1  1     2     EE

I would like to plot the points of Var1 and Var2, using ID as the x-axis, but color the points by Group. I can only manage to color the points by ID after transforming the dataset to tall format using the reshape package.

Thanks for your help!
Re: [R] selecting certain rows from data frame
Hi,

You can use ?split()

lst1 <- split(DF, DF$ID)
lst1[1:2]
#$`1`
#  ID  drugs month
#1  1 drug x     1
#4  1 drug x     1
#5  1 drug y     2
#6  1 drug z     3
#
#$`2`
#  ID  drugs month
#2  2 drug y     2
#7  2 drug x     1

mean(sapply(lst1, nrow))
#[1] 2.4

#or
library(plyr)
mean(ddply(DF, .(ID), nrow)[, 2])
#[1] 2.4

#or
mean(with(DF, tapply(ID, ID, FUN = length)))
#[1] 2.4

A.K.

From: Sarah Jo Sinnott 105405...@umail.ucc.ie
To: arun smartpink...@yahoo.com
Sent: Friday, May 3, 2013 4:35 PM
Subject: Re: selecting certain rows from data frame

Yes - but if I can count the number of rows for each ID, this equates to the number of drugs for each ID. So that way I can get a mean number of rows (drugs), e.g.:

ID 1 = 4 rows (approx. 4 drugs)
ID 2 = 2 rows
ID 3 = 3 rows
ID 4 = 2 rows
ID 5 = 1 row

12 rows / 5 people = 2.4 rows/person, that is, 2.4 drugs per person.

Do you think it is possible to isolate the number of rows per unique ID? It would be great if you could! I've tried reorganising my data into wide format, but it doesn't work very well, so I'm left with this option really!

Thank you for your help thus far
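For anyone wanting to try the counting idea without the original data, here is a self-contained sketch. DF below is a made-up stand-in: only the rows for IDs 1 and 2 follow the printed output above, the rest are invented so that the row counts per ID come out as 4, 2, 3, 2 and 1.

```r
# Hypothetical data: rows 3 and 8-12 are invented for illustration.
DF <- data.frame(
  ID    = c(1, 2, 3, 1, 1, 1, 2, 3, 3, 4, 4, 5),
  drugs = c("drug x", "drug y", "drug z", "drug x", "drug y", "drug z",
            "drug x", "drug x", "drug y", "drug y", "drug z", "drug x"),
  month = c(1, 2, 3, 1, 2, 3, 1, 1, 2, 2, 3, 1)
)

# Number of rows (drug records) per ID.
rows_per_id <- table(DF$ID)
rows_per_id

# Mean number of rows per person: 12 rows / 5 IDs.
mean_rows <- mean(as.vector(rows_per_id))
mean_rows
```

table() is just one more base-R route to the same per-ID counts that split(), ddply() and tapply() give above.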
Re: [R] color by group in ggplot
On May 3, 2013, at 1:37 PM, Ye Lin wrote:

> Hey, I have a dataset like this:
>
> ID  Var1  Var2  Group
> A1  1     1     BB
> A2  1     2     AA
> B1  2     1     CC
> B2  1     3     DD
> C1  1     2     EE
>
> I would like to plot the points of Var1 and Var2, use ID as X-axis, but color the points by Group. I can only manage to color the points by ID after transform the dataset to tall using reshape package.

If I were given the task of designing a plotting system that would decide what to do with a categorical x-axis request, it would probably deliver a barplot. My guess is that you do not want that. But what do you mean by a point whose x-value is "A1"?

--
David Winsemius
Alameda, CA, USA
Re: [R] color by group in ggplot
I want to plot the values of Var1 and Var2 on the same plot, with the x-axis labelled with the list of IDs. But I want to color the points by their category in Group. Is it possible to do this in ggplot, or do I have to plot from scratch using basic plot?

On Fri, May 3, 2013 at 1:49 PM, David Winsemius dwinsem...@comcast.net wrote:

> If I were given the task of designing a plotting system that would decide what to do with a categorical x-axis request, it would probably deliver a barplot. My guess is that you do not want that. But what do you mean by a point whose x-value is "A1"?
>
> --
> David Winsemius
> Alameda, CA, USA

attachment: image.png
Re: [R] color by group in ggplot
Hi,

Maybe this helps:

dat1 <- read.table(text = "
ID Var1 Var2 Group
A1 1 1 BB
A2 1 2 AA
B1 2 1 CC
B2 1 3 DD
C1 1 2 EE
", sep = "", header = TRUE)
library(reshape2)
dat2 <- melt(dat1, id.var = c("ID", "Group"))
library(ggplot2)
ggplot(dat2, aes(x = ID, y = value, group = Group, colour = Group)) + geom_point()

A.K.

----- Original Message -----
From: Ye Lin ye...@lbl.gov
To: R help r-help@r-project.org
Cc:
Sent: Friday, May 3, 2013 4:37 PM
Subject: [R] color by group in ggplot

Hey, I have a dataset like this:

ID  Var1  Var2  Group
A1  1     1     BB
A2  1     2     AA
B1  2     1     CC
B2  1     3     DD
C1  1     2     EE

I would like to plot the points of Var1 and Var2, use ID as X-axis, but color the points by Group. I can only manage to color the points by ID after transform the dataset to tall using reshape package. Thanks for your help!
Re: [R] color by group in ggplot
Thanks A.K.

I also added shape=variable, so that it is much easier to tell the two variables apart by color + shape.

On Fri, May 3, 2013 at 2:14 PM, arun smartpink...@yahoo.com wrote:

> Hi,
> Maybe this helps:
>
> dat1 <- read.table(text = "
> ID Var1 Var2 Group
> A1 1 1 BB
> A2 1 2 AA
> B1 2 1 CC
> B2 1 3 DD
> C1 1 2 EE
> ", sep = "", header = TRUE)
> library(reshape2)
> dat2 <- melt(dat1, id.var = c("ID", "Group"))
> library(ggplot2)
> ggplot(dat2, aes(x = ID, y = value, group = Group, colour = Group)) + geom_point()
>
> A.K.
Re: [R] Vector allocation problem while trying to plot 6 MB data file
On 02.05.2013 14:37, Ramon Hofer wrote:

> Hi all
>
> I'm trying to analyse the network speed and used iperf to create a csv file containing the link test data. It's only about 6 MB big but contains about 40'000 samples.
>
> I can do boxplots (apart from printing the number of samples, but I'll ask separately about that).
>
> To find the behaviour over time I wanted to plot the throughput. So I have this command:
>
> plot(A$Timestamp, A$Bandwidth.bit.sec., xlab = "Timestamp", ylab = "Bandwidth [bit/s]", ylim = quantile(A$Bandwidth.bit.sec., c(0, .99), na.rm = TRUE))
>
> Unfortunately I get this:
>
> Error: cannot allocate vector of size 12.5 Gb

40000 samples and 6 MB can't be the issue, unless this is not a regular plot and the classes of A$Timestamp or A$Bandwidth.bit.sec. are rather special. What do

str(A$Timestamp)
str(A$Bandwidth.bit.sec.)

tell us? Can you make a reproducible example available?

Best,
Uwe Ligges

> Is there a way around this problem or will I have to split the data?
>
> Best
> Ramon
Re: [R] Write date class as number of days from 1970
On 03.05.2013 15:59, Manta wrote:

> Dear all,
>
> I have a dataset with one column being of class Date. When I write the output, I would like that column to be written as the number of days from 1970-01-01. I could not find anywhere a way to do it.

as.numeric(x)

where x is the Date object.

Uwe Ligges

> Thanks,
> Marco
>
> --
> View this message in context: http://r.789695.n4.nabble.com/Write-date-class-as-number-of-days-from-1970-tp4666155.html
> Sent from the R help mailing list archive at Nabble.com.
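A short sketch of the suggestion above, with made-up dates and a temporary output file (neither comes from Marco's data). A Date is stored internally as the number of days since 1970-01-01, so as.numeric() simply exposes that count.

```r
# Example dates (illustrative only).
x <- as.Date(c("1970-01-01", "1970-01-10", "2013-05-03"))

# Days since 1970-01-01.
days <- as.numeric(x)
days

# Write the numeric version instead of the formatted dates.
out <- data.frame(date = days, value = c(10, 20, 30))
out_file <- tempfile(fileext = ".csv")
write.csv(out, out_file, row.names = FALSE)

# Converting back needs the origin stated explicitly.
back <- as.Date(days, origin = "1970-01-01")
```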
Re: [R] R CMD building SPEEDY
On 02.05.2013 05:10, ren_az wrote:

> Hello every one:
>
> I get the following warning when building my R package with R-3.0.0.
>
> building 'SPEEDY.tar.gz'
> Warning in utils::tar(filepath, pkgname, compression = "gzip", compression_level = 9L, :
>   number of items to replace is not a multiple of replacement length
>
> thanks Michael
>
> I have no idea about this, can you help me?

Can you show us the package? I am not able to generate a problem like this one with R-3.0.0, i.e. that R tries to create a file called SPEEDY.tar.gz without a version number.

Best,
Uwe Ligges

> Best regards
>
> Ren Aizhen r...@bi.cs.titech.ac.jp
> 152-8552, 2-12-1, W8-76 (E507)
> Tel: 03-5734-3645, Fax: 03-5734-3646
Re: [R] color by group in ggplot
On May 3, 2013, at 1:57 PM, Ye Lin wrote:

> I want to plot the values of Var1 and Var2 on the same plot, with x-axis labeling as the list of IDs. Sth like this:
>
> image.png
>
> But I want to color the points based on the category in Group, I dont know how to do it with ggplot.

You didn't say what class the ID variable was, but if it were a factor (as is most likely), then:

plot( as.numeric(dfrm$ID), dfrm$Var1)
points( as.numeric(dfrm$ID), dfrm$Var2)

With whatever means of distinguishing overlapping points (pch, col, jittering) might suit you.

--
David.

> Thanks!
>
> On Fri, May 3, 2013 at 1:49 PM, David Winsemius dwinsem...@comcast.net wrote:
>> If I were given the task of designing a plotting system that would decide what to do with a categorical x-axis request, it would probably deliver a barplot. My guess is that you do not want that. But what do you mean by a point whose x-value is "A1"?

David Winsemius
Alameda, CA, USA
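The base-graphics suggestion above can be fleshed out roughly as follows. This is only a sketch: the data frame name dfrm follows the reply, the values are the five rows from the original question, and the choice of plotting symbols and legend position is arbitrary.

```r
# The five rows from the original question.
dfrm <- data.frame(
  ID    = factor(c("A1", "A2", "B1", "B2", "C1")),
  Var1  = c(1, 1, 2, 1, 1),
  Var2  = c(1, 2, 1, 3, 2),
  Group = factor(c("BB", "AA", "CC", "DD", "EE"))
)

x    <- as.numeric(dfrm$ID)      # factor codes become x positions
cols <- as.integer(dfrm$Group)   # one palette colour per Group level

# Var1 as filled circles, Var2 as filled triangles, same colour per Group.
plot(x, dfrm$Var1, col = cols, pch = 16, xaxt = "n",
     xlab = "ID", ylab = "value",
     ylim = range(dfrm$Var1, dfrm$Var2))
points(x, dfrm$Var2, col = cols, pch = 17)

# Put the ID labels back on the x-axis and add a Group legend.
axis(1, at = x, labels = as.character(dfrm$ID))
legend("topleft", legend = levels(dfrm$Group),
       col = seq_along(levels(dfrm$Group)), pch = 15, title = "Group")
```

Points of Var1 and Var2 that share a value for the same ID will overlap; a small horizontal offset on one of the two series is one way around that.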
[R] A problem of splitting the right screen in 3 or more independent vertical boxes:
Hi,

Based on the par function, I can split the screen into two parts, left and right. I wish x to occupy the left half of the screen and all plants to occupy the right half, which is what happens right now.

But I wish the right half to be split into 3 or more vertical parts, where each pair of the same type of plant is together in its own block of boxplots, because each plant has its own unit of measure. Let's say wheat is measured in tons, tomatoes in pounds and cucumbers as counts. :-)

x <- rnorm(1000,mean=0,sd=1,main="Right screen")
wheat1 <- rnorm(100,mean=0,sd=1)
wheat2 <- rnorm(150,mean=0,sd=2)
tomatos3 <- rnorm(200,mean=0,sd=3)
tomatos4 <- rnorm(250,mean=0,sd=4)
cucumbers5 <- rnorm(300,mean=0,sd=5)
cucumbers6 <- rnorm(400,mean=0,sd=6)
par(mfrow=c(1,2))
hist(x, main="Left screen OK")
boxplot(wheat1,wheat2,tomatos3,tomatos4,cucumbers5,cucumbers6)
title("Right screen: boxplot with plants")

Thank you in advance for any suggestions,
Aldi
Re: [R] A problem of splitting the right screen in 3 or more independent vertical boxes:
Hmm, I pasted a typo by mistake in my x vector. It has to be:

x <- rnorm(1000,mean=0,sd=1)
wheat1 <- rnorm(100,mean=0,sd=1)
wheat2 <- rnorm(150,mean=0,sd=2)
tomatos3 <- rnorm(200,mean=0,sd=3)
tomatos4 <- rnorm(250,mean=0,sd=4)
cucumbers5 <- rnorm(300,mean=0,sd=5)
cucumbers6 <- rnorm(400,mean=0,sd=6)
par(mfrow=c(1,2))
hist(x, main="Left screen OK")
boxplot(wheat1,wheat2,tomatos3,tomatos4,cucumbers5,cucumbers6)
title("Right screen: boxplot with plants")

Thanks,
Aldi

On 5/3/2013 4:46 PM, Aldi Kraja wrote:

> Hi,
> Based on par function, I can split the screen into two parts left and right. I wish x occupies the half left screen, and all plants occupy half right screen, which happens right now.
> But I wish the right screen, to be split in 3 or more vertical parts where each pair of the same type of plant, are together in its own block of boxplot, because each plant has its own unit of measure. Let's say wheat is measured in ton, tomato in pound and cucumbers as counts. :-)
Re: [R] A problem of splitting the right screen in 3 or more independent vertical boxes:
Hi Aldi,

You might want ?layout instead.

Sarah

On Fri, May 3, 2013 at 5:54 PM, Aldi Kraja a...@wustl.edu wrote:

> Hmm, I had a typo paste by mistake in my x vector. It has to be:
>
> x <- rnorm(1000,mean=0,sd=1)
> wheat1 <- rnorm(100,mean=0,sd=1)
> wheat2 <- rnorm(150,mean=0,sd=2)
> tomatos3 <- rnorm(200,mean=0,sd=3)
> tomatos4 <- rnorm(250,mean=0,sd=4)
> cucumbers5 <- rnorm(300,mean=0,sd=5)
> cucumbers6 <- rnorm(400,mean=0,sd=6)
> par(mfrow=c(1,2))
> hist(x, main="Left screen OK")
> boxplot(wheat1,wheat2,tomatos3,tomatos4,cucumbers5,cucumbers6)
> title("Right screen: boxplot with plants")
>
> Thanks,
> Aldi

--
Sarah Goslee
http://www.functionaldiversity.org
[R] R package for bootstrapping (comparing two quadratic regression models)
Hello,

I want to compare two quadratic regression models with a non-parametric bootstrap. However, I do not know which R package can serve the purpose, such as boot, rms, bootstrap, or DeltaR. Please kindly advise, and thank you.

Elaine

The two quadratic regression models are

y1 = a1*x^2 + b1*x + c1
y2 = a2*x^2 + b2*x + c2

where
y1 = observed migration distance of butterflies
y2 = predicted migration distance of butterflies (based on body mass)
x  = body mass of butterflies

Null hypothesis: a1 = a2 and b1 = b2 and c1 = c2

Bootstrap to test whether the coefficients (a, b, c) of the y1 and the y2 models differ.
Re: [R] A problem of splitting the right screen in 3 or more independent vertical boxes:
On May 3, 2013, at 3:21 PM, Sarah Goslee wrote:

> Hi Aldi,
>
> You might want ?layout instead.

Indeed. In particular a matrix argument might be:

matrix(c(1,2,3, 4,4,4)

> Sarah
>
> On Fri, May 3, 2013 at 5:54 PM, Aldi Kraja a...@wustl.edu wrote:
>> Hmm, I had a typo paste by mistake in my x vector. It has to be:
>>
>> x <- rnorm(1000,mean=0,sd=1)
>> wheat1 <- rnorm(100,mean=0,sd=1)
>> wheat2 <- rnorm(150,mean=0,sd=2)
>> tomatos3 <- rnorm(200,mean=0,sd=3)
>> tomatos4 <- rnorm(250,mean=0,sd=4)
>> cucumbers5 <- rnorm(300,mean=0,sd=5)
>> cucumbers6 <- rnorm(400,mean=0,sd=6)
>> par(mfrow=c(1,2))
>> hist(x, main="Left screen OK")
>> boxplot(wheat1,wheat2,tomatos3,tomatos4,cucumbers5,cucumbers6)

I think you will need a separate call to boxplot for each grouping. The `boxplot` function will not be able to access the device specifications.

--
David.

>> title("Right screen: boxplot with plants")
>>
>> Thanks,
>> Aldi

David Winsemius
Alameda, CA, USA
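Putting the two hints together (layout() plus one boxplot() call per plant type), a sketch might look like the following. The layout matrix, the widths and the margins are my own choices for illustration, not something given in the thread; the variable names follow Aldi's example.

```r
set.seed(1)
x          <- rnorm(1000, mean = 0, sd = 1)
wheat1     <- rnorm(100, mean = 0, sd = 1)
wheat2     <- rnorm(150, mean = 0, sd = 2)
tomatos3   <- rnorm(200, mean = 0, sd = 3)
tomatos4   <- rnorm(250, mean = 0, sd = 4)
cucumbers5 <- rnorm(300, mean = 0, sd = 5)
cucumbers6 <- rnorm(400, mean = 0, sd = 6)

# One row, four figures: the histogram takes half the width,
# the three boxplot panels share the other half.
n_fig <- layout(matrix(1:4, nrow = 1), widths = c(3, 1, 1, 1))
old_par <- par(mar = c(4, 2.5, 3, 0.5))   # tighter margins for narrow panels

hist(x, main = "Left half")
boxplot(wheat1, wheat2, main = "Wheat (ton)")            # own y-axis scale
boxplot(tomatos3, tomatos4, main = "Tomato (pound)")
boxplot(cucumbers5, cucumbers6, main = "Cucumber (count)")

par(old_par)
layout(1)   # back to a single-figure device
```

Because each boxplot() call draws into its own panel, every plant type gets an independent y-axis, which is what the different units of measure call for.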
[R] how to parallelize 'apply' across multiple cores on a Mac
Hi everyone,

I'm trying to use apply (with a call to zoo's rollapply within) on the columns of a 1.5K x 165K matrix, and I'd like to make use of the other cores on my machine to speed it up. (And hopefully also leave more memory free: I find that after I create a big object like this, I have to save my workspace and then close and reopen R to be able to recover memory tied up by R, but maybe that's a separate issue -- if so, please let me know!)

It seems the package 'multicore' has a parallel version of 'lapply', which I suppose I could combine with a 'do.call' (I think) to gather the elements of the output list into a matrix, but I was wondering whether there might be another route.

And, in case the particular way I constructed the call to 'apply' might be the source of the problem, here is a deconstructed version of what I did to each column, for easier parsing:

- begin call to 'apply'

Step 1: Identify several disjoint subsequences of fixed length, say length three, of a column.

    column.values <- 1:16
    desired.subseqs <- c( NA, NA, NA, 1, 1, 1, NA, 1, 1, 1, NA, NA, 1, 1, 1, NA )  # this vector is used for every column.
    desired.values <- desired.subseqs * column.values

Step 2: Find the average value of each subsequence.

    desired.means <- rollapply( desired.values, 3, mean, fill = NA, align = "right", na.rm = FALSE )  # put mean in highest index of subsequence and retain original vector length
    desired.means
    [1] NA NA NA NA NA 5 NA NA NA 9 NA NA NA NA 14 NA

Step 3: Shift values forward by one index value, retaining original vector length.

    desired.means <- zoo( desired.means )  # in order to be able to use lag.zoo
    desired.means <- lag( desired.means, k = -1, na.pad = TRUE )
    desired.means
    [1] NA NA NA NA NA NA 5 NA NA NA 9 NA NA NA NA 14

Step 4: Use last-observation-carried-forward, retaining original vector length.

    desired.means <- na.locf( desired.means, na.rm = FALSE )
    desired.means
    [1] NA NA NA NA NA NA 5 5 5 5 9 9 9 9 9 14

Step 5: Use next-observation-carried-backward to assign values to the initial sequence of NAs.

    desired.means <- na.locf( desired.means, fromLast = TRUE )
    desired.means
    [1] 5 5 5 5 5 5 5 5 5 5 9 9 9 9 9 14

Step 6: Convert back to vector (from zoo object), and subtract from column.

    desired.column <- column.values - coredata( desired.means )
    desired.column
    [1] -4 -3 -2 -1 0 1 2 3 4 5 2 3 4 5 6 2

- end call to 'apply'

Thanks,
David
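For readers following the thread, here is a minimal sketch of one possible route, not a tested answer for a matrix of this size. It wraps Steps 1 to 6 in a function and runs it over columns with mclapply from the 'parallel' package, which has shipped with R since 2.14.0 and absorbed the 'multicore' functionality. The names center_on_block_means, mask, big and n.cores are hypothetical, the small random matrix is a stand-in for the real data, and the shift in Step 3 is done on a plain vector instead of through lag.zoo.

```r
library(zoo)       # rollapply, na.locf
library(parallel)  # mclapply (forks the R process, so not available in parallel on Windows)

# Hypothetical helper: Steps 1-6 of the post applied to one column.
# `mask` plays the role of desired.subseqs, `width` is the subsequence length.
center_on_block_means <- function(column.values, mask, width = 3) {
  desired.values <- mask * column.values
  # Mean of each window, stored at the window's last index; NA elsewhere.
  m <- rollapply(desired.values, width, mean, fill = NA,
                 align = "right", na.rm = FALSE)
  m <- c(NA, m[-length(m)])          # shift forward by one index, keep length
  m <- na.locf(m, na.rm = FALSE)     # last observation carried forward
  m <- na.locf(m, fromLast = TRUE)   # fill the leading NAs backward
  column.values - m
}

mask <- c(NA, NA, NA, 1, 1, 1, NA, 1, 1, 1, NA, NA, 1, 1, 1, NA)

set.seed(1)
big <- matrix(rnorm(16 * 8), nrow = 16)  # small stand-in for the real matrix

# Assumption: two worker processes; fall back to one where forking is unavailable.
n.cores <- if (.Platform$OS.type == "windows") 1L else 2L

cols <- mclapply(seq_len(ncol(big)),
                 function(j) center_on_block_means(big[, j], mask),
                 mc.cores = n.cores)
result <- do.call(cbind, cols)           # gather the list back into a matrix
```

Each worker returns one column, so do.call(cbind, ...) rebuilds a matrix with the same dimensions as the input. Whether this actually helps with memory on a 1.5K x 165K matrix is not something this sketch establishes.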
[R] how to best add columns to a matrix with many columns
Hi everyone,

I have a large data frame, say df1, with 165K columns, and all but the first four columns of df1 are numeric. I transformed the numeric data and obtained a matrix, call it data.m, with 165K - 4 columns, and then tried to create a second data frame by replacing the numeric columns of df1 by data.m. I did this in two ways, and both ways instantly used up all the available memory, so I was wondering whether there was a better way to do this. Here's what I tried:

    df2 <- df1
    df2[ , 5:length(df1)] <- data.m

and

    df2 <- cbind( df1[1:4], data.m )

Thanks,
David
Re: [R] mean for each observation
Hi,

Not sure I understand it correctly.

    dat1 <- read.table(text="
    site Year doy fish Feed swim agr_1 agr_2 agr_3 rest hide
    3 2012 203 1 1 0 0 0 0 0 0
    3 2012 203 1 0 1 0 0 0 0 0
    3 2012 203 1 0 1 0 0 0 0 0
    3 2012 203 2 0 0 0 0 0 1 0
    3 2012 203 2 1 0 0 0 0 0 0
    3 2012 203 2 1 0 0 0 0 0 0
    4 2012 197 1 0 0 0 0 0 1 0
    4 2012 197 1 1 0 0 0 0 0 0
    4 2012 197 1 0 1 0 0 0 0 0
    4 2012 197 3 0 0 0 0 0 0 1
    4 2012 197 3 1 0 0 0 0 0 0
    ", sep="", header=TRUE)
    dat2 <- reshape(dat1, direction="long", varying=7:9, sep="_")
    row.names(dat2) <- 1:nrow(dat2)
    head(dat2)
    #   site Year doy fish Feed swim rest hide time agr id
    # 1    3 2012 203    1    1    0    0    0    1   0  1
    # 2    3 2012 203    1    0    1    0    0    1   0  2
    # 3    3 2012 203    1    0    1    0    0    1   0  3
    # 4    3 2012 203    2    0    0    1    0    1   0  4
    # 5    3 2012 203    2    1    0    0    0    1   0  5
    # 6    3 2012 203    2    1    0    0    0    1   0  6

    library(plyr)
    # fish, year, site
    ddply(dat2, .(fish, Year, site), function(x) numcolwise(mean)(x[, c(5:8)]))
    #   fish Year site  Feed  swim  rest hide
    # 1    1 2012    3 0.333 0.667 0.000  0.0
    # 2    1 2012    4 0.333 0.333 0.333  0.0
    # 3    2 2012    3 0.667 0.000 0.333  0.0
    # 4    3 2012    4 0.500 0.000 0.000  0.5

    # fish
    ddply(dat2, .(fish), function(x) numcolwise(mean)(x[, c(5:8)]))
    #   fish  Feed swim  rest hide
    # 1    1 0.333  0.5 0.167  0.0
    # 2    2 0.667  0.0 0.333  0.0
    # 3    3 0.500  0.0 0.000  0.5

A.K.

Hi, I recorded fish behavior at different sites. Each fish represents a rep at each site, e.g. for my data:

    site Year doy fish Feed swim agr_1 agr_2 agr_3 rest hide
       3 2012 203    1    1    0     0     0     0    0    0
       3 2012 203    1    0    1     0     0     0    0    0
       3 2012 203    1    0    1     0     0     0    0    0
       3 2012 203    2    0    0     0     0     0    1    0
       3 2012 203    2    1    0     0     0     0    0    0
       3 2012 203    2    1    0     0     0     0    0    0
       4 2012 197    1    0    0     0     0     0    1    0
       4 2012 197    1    1    0     0     0     0    0    0
       4 2012 197    1    0    1     0     0     0    0    0
       4 2012 197    3    0    0     0     0     0    0    1
       4 2012 197    3    1    0     0     0     0    0    0

1. I would like to combine columns agr_1, agr_2 and agr_3.
2. How do I calculate the mean for each fish for each behavior?

Any suggestion is appreciated.
Thanks
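As a side note, the same per-fish means can be sketched in base R without plyr. This reads the question differently from the reply above: it assumes "combine" means adding the three agr_* indicator columns into a single agr column (an assumption, since the poster did not say), and then averages each behavior per fish with aggregate(). The name by.fish is hypothetical.

```r
dat1 <- read.table(text = "
site Year doy fish Feed swim agr_1 agr_2 agr_3 rest hide
3 2012 203 1 1 0 0 0 0 0 0
3 2012 203 1 0 1 0 0 0 0 0
3 2012 203 1 0 1 0 0 0 0 0
3 2012 203 2 0 0 0 0 0 1 0
3 2012 203 2 1 0 0 0 0 0 0
3 2012 203 2 1 0 0 0 0 0 0
4 2012 197 1 0 0 0 0 0 1 0
4 2012 197 1 1 0 0 0 0 0 0
4 2012 197 1 0 1 0 0 0 0 0
4 2012 197 3 0 0 0 0 0 0 1
4 2012 197 3 1 0 0 0 0 0 0
", header = TRUE)

# Assumption: "combine" = row-wise sum of the three agr_* indicators.
dat1$agr <- rowSums(dat1[, c("agr_1", "agr_2", "agr_3")])

# Mean of every behavior column within each fish.
by.fish <- aggregate(cbind(Feed, swim, agr, rest, hide) ~ fish,
                     data = dat1, FUN = mean)
by.fish
```

Adding site and Year to the right-hand side of the formula (fish + Year + site) would give the finer grouping used in the first ddply call.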
Re: [R] how to best add columns to a matrix with many columns
I am not seeing any good justification in your description for converting to a matrix if you are planning to convert it back to a data frame. Memory is going to be used inefficiently if you do this.

---
Jeff Newmiller
DCN: jdnew...@dcn.davis.ca.us
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---
Sent from my phone. Please excuse my brevity.

David Romano drom...@stanford.edu wrote:

Hi everyone, I have a large data frame, say df1, with 165K columns, and all but the first four columns of df1 are numeric. I transformed the numeric data and obtained a matrix, call it data.m, with 165K - 4 columns, and then tried to create a second data frame by replacing the numeric columns of df1 by data.m. I did this in two ways, and both ways instantly used up all the available memory, so I was wondering whether there was a better way to do this. Here's what I tried:

    df2 <- df1
    df2[ , 5:length(df1)] <- data.m

and

    df2 <- cbind( df1[1:4], data.m )

Thanks, David
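To make the suggestion concrete, here is a minimal sketch of transforming the numeric columns while staying in a data frame throughout, so that no separate matrix is built and converted back. The toy df1, the column names and transform_col are hypothetical stand-ins; the real transformation was not shown in the thread, and this sketch has not been tried at 165K columns, so it should not be read as a guaranteed fix for the memory problem.

```r
# Toy stand-in: four non-numeric-role columns followed by numeric ones.
df1 <- data.frame(id = 1:3,
                  a  = letters[1:3],
                  b  = LETTERS[1:3],
                  c  = c(TRUE, FALSE, TRUE),
                  v1 = c(1, 2, 3),
                  v2 = c(10, 20, 30))

num <- 5:ncol(df1)                        # positions of the numeric columns

# Hypothetical per-column transformation (here: centering).
transform_col <- function(x) x - mean(x)

df2 <- df1
df2[num] <- lapply(df2[num], transform_col)  # list-style assignment, column by column
```

Using df2[num] (list-style indexing) instead of df2[ , num] keeps the assignment a column-wise list replacement.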
Re: [R] Calculating distance matrix for large dataset
I have a version that uses bigmemory on my blog, but it looks at distance on a sphere for a 36K x 36K matrix, not hundreds of GB, so I don't know if the approach will work for you.

http://stevemosher.wordpress.com/2012/04/12/nick-stokes-distance-code-now-with-big-memory/

Steve

However, I never tested it with

On May 2, 2013 9:40 PM, HJ YAN yhj...@googlemail.com wrote:

Dear R users,

I wondered if any of you ever tried to calculate a distance matrix with a very large data set, and if anyone out there can confirm that this error message I got actually means that my data is too large for this task:

    negative length vectors are not allowed

My data size and code used:

    dim(mydata_nor)
    [1] 365000 144
    d <- dist(mydata_nor, method = "euclidean")

Here my data has 1000 samples, each of which has a year of data observed at 10-minute intervals daily, so the size is (365 * 1000) * 144.

I checked the manual of function 'dist' but cannot see the upper limit on the size allowed, and I bet there should be one, so any hints are appreciated. I would also be grateful if any other method for calculating a distance matrix for a large dataset could be advised.

I appreciate that reproducible code should be provided for your advice, so try the code below if needed:

    A <- matrix(1:365000*144, nrow=365000, ncol=144)
    dim(A)
    [1] 365000 144
    d1 <- dist(A, method="euclidean")
    Error in dist(A, method = "euclidean") : negative length vectors are not allowed

Many thanks in advance!
HJ
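Some back-of-the-envelope arithmetic may help explain the error. dist() stores one value per pair of rows, i.e. n(n-1)/2 doubles. For n = 365000 that count is far beyond the largest 32-bit integer, which is the likely reason a length computed internally came out negative (an inference from the numbers, not something confirmed in the thread). The sketch below shows the arithmetic and then a hypothetical helper, block_dist, that computes Euclidean distances for one block of rows at a time so the full matrix never has to exist in memory; the small matrix X is a stand-in for the real data.

```r
n <- 365000
n.pairs <- n * (n - 1) / 2        # pairwise distances dist() would have to store
n.pairs                            # 66612317500
n.pairs > .Machine$integer.max     # TRUE: more elements than a 32-bit length can index
n.pairs * 8 / 2^30                 # storage in GiB at 8 bytes per double (about 496)

# Hypothetical helper: distances from the rows in `rows` to every row of X,
# using |a - b|^2 = |a|^2 + |b|^2 - 2 a.b; pmax() guards against tiny
# negative values caused by rounding.
block_dist <- function(X, rows) {
  sq <- rowSums(X^2)
  d2 <- outer(sq[rows], sq, "+") - 2 * X[rows, , drop = FALSE] %*% t(X)
  sqrt(pmax(d2, 0))
}

set.seed(42)
X <- matrix(rnorm(20 * 5), nrow = 20)  # small stand-in for mydata_nor
d.block <- block_dist(X, 1:4)          # 4 x 20 block of the distance matrix
```

Each block can be summarised or written to disk before the next one is computed. Whether that is useful depends on what the distances are needed for, which the original post does not say.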
[R] R 2.15.2 Failed to load sRGB colorspace file
Hello,

I built R 2.15.2 on Solaris x64, and I have an issue when trying to execute the check target to test whether everything is OK. Do you have any idea what could be causing this issue?

Error message:

    Examples/tools-Ex.Rout.fail
    > cat("Time elapsed: ", proc.time() - get("ptime", pos = 'CheckExEnv'),"\n")
    Time elapsed: 1.483 0.045 2.195 0 0
    > grDevices::dev.off()
    Error in grDevices::dev.off() : Failed to load sRGB colorspace file
    Execution halted

Thanks for your help,
Humberto.
[R] Factor deletion criteria
Hi,

I would like to know the criteria by which R removes a factor level in linear models. For example, I have a four-level factor, and R creates 3 dummies to estimate coefficients. Which level is chosen? Can I change it?

Thanks,
Iuri
Re: [R] how to best add columns to a matrix with many columns
Sorry, Jeff, I misspoke: the 'matrix' data.m is really a data frame -- I was just thinking about it as a matrix since it's the numeric part of df1, and didn't realize the thought had made its way into the message. So the memory issues are unrelated to converting between data frames and matrices.

-David

On Fri, May 3, 2013 at 8:20 PM, Jeff Newmiller jdnew...@dcn.davis.ca.us wrote:

I am not seeing any good justification in your description for converting to a matrix if you are planning to convert it back to a data frame. Memory is going to be used inefficiently if you do this.
Re: [R] Factor deletion criteria
On May 3, 2013, at 3:32 PM, Iuri Gavronski wrote:

Hi, I would like to know the criteria by which R removes a factor level in linear models. For example, I have a four-level factor, and R creates 3 dummies to estimate coefficients. Which level is chosen? Can I change it?

The default order is alphabetical. The lowest item in lexical sort order is the reference level. Changing levels is possible:

    ?levels
    ?factor

--
David Winsemius
Alameda, CA, USA
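A small illustration of the point above, using made-up data (the data frame d and its columns are hypothetical). With the default treatment contrasts, the first level of the factor is the one absorbed into the intercept; relevel() from the stats package moves a chosen level to the front.

```r
# Made-up data: a four-level factor entered in non-alphabetical order.
d <- data.frame(y = c(1, 2, 3, 4, 5, 6, 7, 8),
                g = factor(rep(c("b", "a", "d", "c"), each = 2)))

levels(d$g)                 # levels are sorted, so "a" comes first
fit1 <- lm(y ~ g, data = d)
names(coef(fit1))           # dummies for b, c and d; "a" is the reference

# Make "c" the reference level instead.
d$g2 <- relevel(d$g, ref = "c")
fit2 <- lm(y ~ g2, data = d)
names(coef(fit2))           # dummies for a, b and d; "c" is the reference
```

The same effect can be had by giving the levels explicitly, e.g. factor(x, levels = c("c", "a", "b", "d")). The fitted values are identical either way; only the parameterisation of the coefficients changes.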
Re: [R] read .csv file and plot a graph
On 05/03/2013 11:49 PM, Vahe nr wrote:

Hi all, I have a big .csv file (21Mb with 100 rows). It has this shape:

        x
    1 NaN
    2 NaN
    3 0.23

and so on. So the first column has x as a header and then the row number; the second column contains values between -1 and 1, and NaN for empty values. What I need to do is: create a new .csv file from this one excluding NaN values, and plot a line graph using the new .csv file. Or can I use the old .csv file to plot a graph excluding NaN values?

Hi Vahe,

If you want to plot the line ignoring the NaN values, rather than having the line break at each NaN, use this:

    vndat <- data.frame(1:10, x=c(-1,-0.6,-0.4,NaN,-0.2,0.2,0.4,NaN,0.6,0.8))
    plot(vndat$x[complete.cases(vndat$x)], type="l")

Jim (the other one)
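For the other half of the question (writing a new .csv without the NaN rows), a minimal sketch follows. The file paths are hypothetical: tempfile() stands in for the real input and output files, and a few made-up rows in the shape described above are written first so the example is self-contained.

```r
# Hypothetical paths; replace with the real file names.
infile  <- tempfile(fileext = ".csv")
outfile <- tempfile(fileext = ".csv")

# Made-up sample in the described shape: row number, then x.
writeLines(c('"","x"', '"1",NaN', '"2",NaN', '"3",0.23',
             '"4",-0.5', '"5",NaN', '"6",0.9'), infile)

dat <- read.csv(infile, row.names = 1)       # first column becomes the row names

# is.na() is TRUE for NaN as well as NA, so this drops the empty values.
clean <- dat[!is.na(dat$x), , drop = FALSE]

write.csv(clean, outfile)                    # new file without the NaN rows

# Line graph against the original row numbers.
plot(as.integer(rownames(clean)), clean$x, type = "l",
     xlab = "row", ylab = "x")
```

Plotting against the original row numbers keeps the horizontal positions of the surviving points, whereas plotting clean$x alone, as in the reply above, spaces them evenly.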