Re: [R] Controlling number of numbers before R rewrites to +e18 etc
Thanks Jim, but I still have the problem that the pre-processing becomes far too computationally expensive. R seems to handle characters and factors much worse than numeric IDs. I don't have enough RAM to even write the file when the IDs are treated as characters instead of numeric values!

Does anyone have any other ideas? Is it not possible to tell R not to rewrite the IDs on import? It wouldn't matter if R at least wrote the correct IDs to the exported CSV file, but it exports the abbreviated version, which is of no use.

Mike

On Sat, Oct 23, 2010 at 3:56 AM, jim holtman jholt...@gmail.com wrote:
> Your best bet is to make sure that you read the IDs in as characters. If they are being read in as floating-point numbers, then there are only about 15 digits of accuracy, so if you have IDs of 18-22 digits, you will be missing data. If you are using read.table, look at colClasses to see how to do this.
>
> Provide a subset of your data and the statements that you are using to read in the data.
>
> On Fri, Oct 22, 2010 at 1:15 PM, ZeMajik zema...@gmail.com wrote:
>> Hey,
>> I'm using R as a pre-processor for a large dataset with IDs which are numeric (but have no numeric meaning, so they can be seen as factors). I do some data formatting and then write it out to a CSV file. However, the problem is that the IDs are very long, 18-22 characters to be precise. R is constantly rewriting these IDs to the abbreviated +eX form, which prevents me from exporting the data to the CSV file since the IDs are no longer intact.
>>
>> I've tried telling R that the ID column is a factor, but this results in two problems:
>> 1) Since I have millions of rows and R is slower at handling factors than numbers, my computer can't run the process in any kind of reasonable time.
>> 2) Some IDs STILL seem to be rewritten somehow.
>>
>> The second point made me believe that perhaps R is rewriting upon import? Does anyone have any tips on how to solve this problem?
>>
>> Thanks,
>> Mike
>
> --
> Jim Holtman
> Cincinnati, OH
> +1 513 646 9390
>
> What is the problem that you are trying to solve?

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
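For later readers, here is a minimal sketch of the colClasses suggestion quoted above. The column names (id, value) and the sample rows are made up for illustration, and the whole thing runs on an in-memory string rather than the poster's real file, so treat it as an assumption-laden example rather than the exact fix for that dataset.

```r
# Hypothetical sample data: IDs of 18 and 22 digits plus one numeric column.
csv_text <- "id,value
123456789012345678,1.5
1234567890123456789012,2.5"

# Read the id column as character so its digits are never converted to double.
dat <- read.csv(text = csv_text,
                colClasses = c(id = "character", value = "numeric"))

# Ordinary processing on the other columns leaves the IDs untouched.
dat$value <- dat$value * 2

# Write the result; quote = FALSE keeps the IDs as bare digits in the file.
out_file <- tempfile(fileext = ".csv")
write.csv(dat, out_file, row.names = FALSE, quote = FALSE)

written <- readLines(out_file)
print(written)
```

Whether this fits in memory for millions of rows depends on the machine. If it does not, processing the file in chunks (for example with the nrows and skip arguments of read.csv) may be worth trying.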
[R] Controlling number of numbers before R rewrites to +e18 etc
Hey,

I'm using R as a pre-processor for a large dataset with IDs which are numeric (but have no numeric meaning, so they can be seen as factors). I do some data formatting and then write it out to a CSV file. However, the problem is that the IDs are very long, 18-22 characters to be precise. R is constantly rewriting these IDs to the abbreviated +eX form, which prevents me from exporting the data to the CSV file since the IDs are no longer intact.

I've tried telling R that the ID column is a factor, but this results in two problems:
1) Since I have millions of rows and R is slower at handling factors than numbers, my computer can't run the process in any kind of reasonable time.
2) Some IDs STILL seem to be rewritten somehow.

The second point made me believe that perhaps R is rewriting upon import? Does anyone have any tips on how to solve this problem?

Thanks,
Mike
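The suspicion that the damage happens on import can be checked with a small experiment. The sketch below uses a made-up 22-digit ID and shows that once such a value is stored as a double, the original digits are gone, whatever notation is used for printing.

```r
# A double has a 53-bit significand, so not every integer above 2^53 is representable.
limit <- 2^53
print(limit == limit + 1)   # two different integers collapse to the same double

id_text <- "1234567890123456789012"   # hypothetical 22-digit ID
id_num  <- as.numeric(id_text)

# Default printing switches to scientific notation.
print(id_num)

# Forcing fixed notation shows digits again, but not the original ones.
id_fixed <- format(id_num, scientific = FALSE)
print(id_fixed)
print(identical(id_fixed, id_text))
```

So changing the print format (for example via options(scipen = ...)) only hides the +eX notation. It cannot bring back digits that were lost when the column was parsed as numeric.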
Re: [R] Creating table from data frame
That's golden, Henrique, thanks a lot! Worked like a charm even with large datasets.

On Tue, Sep 21, 2010 at 2:56 PM, Henrique Dallazuanna www...@gmail.com wrote:
> Try this:
>
> d <- data.frame(A = letters[1:10], B = sample(letters[11:20]), C = sample(10))
> xtabs(C ~ A + B, d)
>
> On Tue, Sep 21, 2010 at 8:39 AM, ZeMajik zema...@gmail.com wrote:
>> Hey,
>> I have a dataset where two columns are factors and another column consists of values. Each combination of factors can only have a single value assigned to it. I'd like to represent this as a matrix or table where the rows are the levels of the first factor and the columns are the levels of the second factor, so that each cell a_ij in the matrix holds the value associated with the factor combination ij. When no such value exists for a combination, the cell should be 0.
>>
>> I've tried playing around with tables to get this to work, but I can't seem to get it right. I've also had little luck when trying to find a solution to this. Any help would be much appreciated!
>>
>> Mike
>
> --
> Henrique Dallazuanna
> Curitiba-Paraná-Brasil
> 25° 25' 40" S 49° 16' 22" O
[R] Creating table from data frame
Hey,

I have a dataset where two columns are factors and another column consists of values. Each combination of factors can only have a single value assigned to it. I'd like to represent this as a matrix or table where the rows are the levels of the first factor and the columns are the levels of the second factor, so that each cell a_ij in the matrix holds the value associated with the factor combination ij. When no such value exists for a combination, the cell should be 0.

I've tried playing around with tables to get this to work, but I can't seem to get it right. I've also had little luck when trying to find a solution to this. Any help would be much appreciated!

Mike
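As an illustration of the layout being asked for, here is a small base-R sketch that fills a zero matrix by indexing with the factor codes. The data frame d and its columns A, B and C are invented for the example. The xtabs() call from the reply in this thread should give the same numbers, which the last line checks.

```r
# Hypothetical data: two factor columns and one value per (A, B) combination.
d <- data.frame(A = factor(c("a", "a", "b", "c")),
                B = factor(c("x", "y", "x", "z")),
                C = c(5, 2, 7, 1))

# Start from an all-zero matrix with one row per level of A and one column per level of B.
m <- matrix(0,
            nrow = nlevels(d$A), ncol = nlevels(d$B),
            dimnames = list(levels(d$A), levels(d$B)))

# A two-column index matrix assigns each value to its (row, column) cell in one step.
m[cbind(as.integer(d$A), as.integer(d$B))] <- d$C
print(m)

# Cross-check against xtabs(), which sums C within each cell (one value per cell here).
same_as_xtabs <- all(as.vector(m) == as.vector(xtabs(C ~ A + B, d)))
print(same_as_xtabs)
```

The matrix route only makes sense under the stated condition of at most one value per combination. With repeated combinations, xtabs() would sum them while the indexing assignment would keep the last one.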
[R] Dividing one column of form xx-yy into two columns, xx and yy
I have a data set where one column consists of two numerical factors, separated by a "-". So my data looks something like this:

43-156
43-43
1267-18
...

There are additional columns consisting of single factors as well, so reading the CSV file (where the data is stored) with sep="-" won't work, since the rest of the factors are separated by commas.

So first of all, is there any way to import a file which is separated by "," OR "-"? If this is not possible, does anyone have any ideas on how I could go about separating these? I could use a text editor to replace the "-" with "," and then import, but I would prefer doing this inside R so that a script could be reused in the future.

Just to clarify, I would like the above to turn out as two separate columns (or vectors), where the first would be (43, 43, 1267, ...) and the second (156, 43, 18, ...).

The dataset is rather large, with a few hundred thousand lines, so it would be preferable to keep resource-intensive methods to a minimum if possible.

Thanks in advance!
Mike
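One possible way to do this inside R is sketched below: read the combined column as text, then split it with vectorised regular expressions. The column names (pair, group) and the sample rows are assumptions made for the example, not taken from the poster's file.

```r
# Hypothetical CSV: the first column holds the xx-yy pairs, the second is another factor.
csv_text <- "pair,group
43-156,a
43-43,b
1267-18,a"

# Keep the combined column as character so it can be split afterwards.
dat <- read.csv(text = csv_text, colClasses = c(pair = "character"))

# sub() is vectorised, so no explicit loop over the rows is needed.
dat$first  <- as.integer(sub("-.*$", "", dat$pair))   # text before the hyphen
dat$second <- as.integer(sub("^.*-", "", dat$pair))   # text after the hyphen

print(dat)
```

Reading the raw lines with readLines(), replacing "-" with "," via gsub(), and passing the result to read.csv(text = ...) would be another option, but only if hyphens never occur in the other columns and the header line is adjusted to match the extra column.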
[R] R-squared for ANOVA model
Hey,

In SAS, after doing an ANOVA you also get the goodness of fit as R-squared in the ANOVA table. I've tried getting the R-squared from R but have failed miserably. I've tried searching Google and the manual as well.

Is it possible to extract R-squared from ANOVA models in R?
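A short sketch of two ways this could be done, using the built-in PlantGrowth data purely as a stand-in for the poster's model. A fit from aov() also inherits from class "lm", so summary.lm() can report R-squared. The same number can be computed by hand from the sums of squares in the ANOVA table.

```r
# One-way ANOVA on a built-in dataset (illustration only).
fit <- aov(weight ~ group, data = PlantGrowth)

# Route 1: treat the aov fit as a linear model and read R-squared from its summary.
r2_from_summary <- summary.lm(fit)$r.squared

# Route 2: model sum of squares divided by total sum of squares.
ss <- summary(fit)[[1]][["Sum Sq"]]   # first entry: group, second entry: residuals
r2_from_ss <- ss[1] / sum(ss)

print(c(summary = r2_from_summary, by_hand = r2_from_ss))
```

For models with several terms, the by-hand version would need the sum of all non-residual rows in the numerator rather than just the first one.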
[R] Treating variables as symbols
Hey,

I'm trying to find out how to perform operations with a variable treated as a symbol. For an extremely simple example, I want to integrate a*x with respect to x and find either the indefinite integral (a*x^2/2) or the definite integral over some interval for x.

Another example of such a use would be to create a function y <- function(x) {a*x} so that typing y(2) gives the result 2*a.

Is there a way to treat variables as merely symbols?

Any help much appreciated,
-M
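For what base R can do on its own, here is a rough sketch. Base R offers symbolic differentiation through D() and numeric definite integration through integrate(), but no symbolic integration, so the indefinite integral is only verified here by differentiating it. The values a = 3 and x = 2 are arbitrary choices for the check.

```r
# Symbolic derivative: differentiate the proposed antiderivative a*x^2/2 with respect to x.
antiderivative <- quote(a * x^2 / 2)
derivative <- D(antiderivative, "x")
print(derivative)

# Evaluate the derivative at sample values; it should equal a*x = 6.
check <- eval(derivative, list(a = 3, x = 2))

# Definite integral of a*x over [0, 1]; here 'a' must have a numeric value.
a <- 3
area <- integrate(function(x) a * x, lower = 0, upper = 1)$value

# Keep 'a' symbolic in a function result by returning an unevaluated call.
y <- function(x) bquote(.(x) * a)
print(y(2))
```

For genuine symbolic integration, a computer algebra package is needed. This sketch only covers what ships with R.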
Re: [R] Treating variables as symbols
Thanks guys, Ryacas is pretty much what I'm looking for! However, I can't seem to get it to work properly. For example:

yacas("Integrate(x) x")
Error in parse(text = text, srcfile = NULL) : unexpected numeric constant in / (^ (x ,2 2

The same thing happens with expressions such as yacas("x*x"). However,

yacas("2*2")
expression(4)

works, so it seems there is a successful connection between yacas and R. Unfortunately, I didn't find any info on the problem by googling it. Any ideas what it might be?

Thanks again,
M

On Thu, Oct 1, 2009 at 3:50 PM, Jorge Ivan Velez jorgeivanve...@gmail.com wrote:
> Hi Zemajik,
>
> Try this:
>
> y <- function(a) paste(a, '*x', sep="")
> y(2)
> [1] "2*x"
>
> Also, take a look at the Ryacas package.
>
> HTH,
> Jorge
>
> On Thu, Oct 1, 2009 at 9:46 AM, ZeMajik wrote:
>> Hey,
>> I'm trying to find out how to perform operations with a variable treated as a symbol. For an extremely simple example, I want to integrate a*x with respect to x and find either the indefinite integral (a*x^2/2) or the definite integral over some interval for x.
>>
>> Another example of such a use would be to create a function y <- function(x) {a*x} so that typing y(2) gives the result 2*a.
>>
>> Is there a way to treat variables as merely symbols?
>>
>> Any help much appreciated,
>> -M
[R] Suppressing the enumeration of output in console
Hi!

This is a pretty low-content question, but I've had major trouble finding an answer to it, so I hope it's alright. I'm obviously new to R, and have been trying to get rid of the enumerated output I get in the console. More specifically, X <- 4; X comes out as

[1] 4

and I'd like to get rid of the leading [1]. This isn't usually a problem when working in the console, but when writing scripts that print out lines of text it gives rather unattractive output.

Thanks in advance!
Mike
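A brief sketch of the usual way around this: the [1] comes from print(), which is what the console calls implicitly, whereas cat() and writeLines() write the text without an index prefix. The variable X and the "Result:" label are just placeholders.

```r
X <- 4

# print() (what the console calls implicitly) adds the index prefix.
print(X)

# cat() writes the value as-is; the newline has to be added explicitly.
cat(X, "\n")
cat("Result:", X, "\n")

# writeLines() prints one line per element of a character vector, also without a prefix.
line <- paste("Result:", X)
writeLines(line)
```

In scripts, cat() tends to be the more flexible choice for mixing text and values, while writeLines() is convenient when the lines are already assembled as strings.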