Re: [R] Controlling number of numbers before R rewrites to +e18 etc

2010-10-25 Thread ZeMajik
Thanks Jim, but I still have the problem that the pre-processing becomes far too computationally expensive. R seems to handle characters and factors much worse than numeric IDs. I don't have enough RAM to even write the file when the IDs are treated as characters instead of numeric values! Anyone have

Re: [R] Controlling number of numbers before R rewrites to +e18 etc

2010-10-25 Thread jim holtman
You can always read a portion of the file and then write it out. For large files, I will read in 10,000 lines, fix them up, write them out, and then go back and process the next batch of lines. You haven't shown us a sample of your input/output, or how you are processing it.
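
The batch approach described above could look roughly like the sketch below. It is only an illustration under some assumptions: the input is a plain comma-separated file with an unquoted header row and no embedded commas, and the function name `process_in_chunks` and its arguments are hypothetical, not something from the original post. Every column is read as character so long IDs are never converted to floating point.

```r
# Hypothetical helper: copy a large CSV in batches, keeping all fields as text.
process_in_chunks <- function(infile, outfile, chunk_size = 10000) {
  in_con <- file(infile, open = "r")
  on.exit(close(in_con), add = TRUE)
  out_con <- file(outfile, open = "w")
  on.exit(close(out_con), add = TRUE)

  # Copy the header once and reuse its names for every batch.
  header <- readLines(in_con, n = 1)
  writeLines(header, out_con)
  col_names <- strsplit(header, ",", fixed = TRUE)[[1]]

  repeat {
    lines <- readLines(in_con, n = chunk_size)
    if (length(lines) == 0) break

    # Parse only this batch; "character" is recycled across all columns.
    chunk <- read.csv(text = lines, header = FALSE,
                      col.names = col_names, colClasses = "character")

    # ... fix up the batch here (assumed step, left empty in this sketch) ...

    # Append the batch to the already-open output connection.
    write.table(chunk, out_con, sep = ",", row.names = FALSE,
                col.names = FALSE, quote = FALSE)
  }
  invisible(NULL)
}
```

Because only `chunk_size` lines are held in memory at a time, peak memory use depends on the batch size rather than on the size of the whole file.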

[R] Controlling number of numbers before R rewrites to +e18 etc

2010-10-22 Thread ZeMajik
Hey, I'm using R as a pre-processor for a large dataset with IDs which are numeric (but have no numeric meaning, so they can be treated as factors). I do some data formatting and then write it out to a CSV file. However, the problem is that the IDs are very long: 18-22 characters, to be precise. R is

Re: [R] Controlling number of numbers before R rewrites to +e18 etc

2010-10-22 Thread jim holtman
Your best bet is to make sure that you read the IDs in as characters. If they are being read in as floating-point numbers, then there are only about 15 digits of accuracy, so if you have IDs of 18-22 digits, you will be losing data. So if you are using read.table, then look at colClasses to see how to do
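
As a small illustration of the colClasses suggestion, the sketch below reads the same data twice, once with default type guessing and once with the ID column forced to character. The sample IDs and the column names `id` and `value` are made up for the example; a real file would be passed by name instead of through `text =`.

```r
# Small in-memory sample standing in for the real file (IDs are invented).
csv_text <- "id,value
1234567890123456789012,10
123456789012345678,20"

# Default type guessing: the id column becomes a double, so the trailing
# digits of very long IDs cannot be represented exactly.
as_numeric <- read.csv(text = csv_text)

# One class per column, in order: keep id as text, parse value as a number.
as_text <- read.csv(text = csv_text,
                    colClasses = c("character", "numeric"))

# Written back out, the character IDs keep every digit and are never
# switched to e+18 style notation.
write.csv(as_text, stdout(), row.names = FALSE, quote = FALSE)
```

Only the ID column needs to be character here; other columns can stay numeric so that arithmetic on them still works.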